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In  1987,  in  his  paper  "Bidiagonalization  and  Diagonalization,"  William  W.  Hager 
presented  an  algorithm  to  diagonalize  a  matrix  with  distinct  eigenvalues.  To  imple- 
ment the  algorithm,  we  need  a  good  approximation  to  the  matrix  of  eigenvectors. 
First,  we  present  the  derivation  of  his  algorithm,  and  then  we  show  how  it  can  be 
generalized  to  diagonalize  a  matrix  with  multiple  eigenvalues.  A  local  quadratic  con- 
vergence result  is  established  for  the  generalized  algorithm.  When  a  good  starting 
guess  for  the  eigenvectors  is  not  known,  Hager's  algorithm  may  not  converge.  To  re- 
store convergence,  we  modified  the  algorithm  by  replacing  diagonalizations  by  Schur 
decompositions.  We  found  that  if  uniformly  distributed  numbers  on  a  circle  in  the 
complex  plane,  which  includes  all  Gerschgorin  disks  for  the  matrix,  are  used  to  ap- 
proximate the  eigenvalues,  and  if  the  eigenvectors  are  approximated  by  the  columns 
of  the  identity  matrix,  then  rapid  convergence  is  obtained.  We  presented  some  node 
programs  which,  along  with  some  modifications  to  the  last  algorithm,  can  be  used  in 
a  distributed  memory  machine,  where  processors  are  interconnected  in  a  ring.  Next, 


we  present  additional  modifications  of  Hager's  algorithm  and  the  generalized  algo- 
rithm that  take  advantage  of  matrix  symmetry  and  reduce  the  given  matrix  into  a 
diagonal  matrix  with  eigenvalues  along  the  diagonal. 


VI 


CHAPTER  1 
INTRODUCTION 

Let  A  be  an  n  x  n  matrix.  We  want  to  find  an  iterative  method  to  compute 
a  diagonalization  A  =  XAX-1,  provided  it  exists,  where  the  columns  of  X  are 
eigenvectors  of  A,  and  A  =  diag  (  Ax,  A2,  ...,  A„  )  whose  diagonal  elements  are  the 
eigenvalues  of  A.  Here  we  will  use  a  sensitivity  result  for  eigenpairs  to  diagonalize  a 
matrix.  We  assume  eigenvalues  of  A  are  all  distinct.  In  the  derivation  of  necessary 
equations  to  diagonalize  a  matrix,  we  need  the  continuity  of  eigenvalues,  and  the 
perturbation  theory  based  on  Gerschgorin's  theorem.  We  give  the  statement  of  the 
theorem  below: 


Theorem  1.0.1  If  A  £  CnXn ,  Pl  =  £  \aiJ\,  and 

Di  =  {zeC:  \z-alt\  <Pl}, 
then 

X(A)  C  ur=1A. 

Furthermore,  if  a  set  of  k  Gerschgorin  disks  D,   are  isolated  from  the  other  n  -  k 
disks,  then  their  union  contains  precisely  k  eigenvalues  of  A. 

Proof  of  Theorem  1.0.1  For  proof,  the  interested  readers  are  referred  to  [Atk89,  The- 
orem 9.1,  page  588]. 

Let  A  be  an  n  x  n  matrix,  whose  eigenvalues  are  all  distinct,  and  let  X~x AX  = 
diag(  Ai,A2,...,Aj,...,An  )  be  a  diagonalization  of  A.    Then  Axj  =  XjXj,  where 

1 


Xj  is  the  jih.  column  of  X.  Given  an  arbitrary  matrix  E,  and  for  a  sufficiently 
small  number  e,  we  will  show  in  the  next  theorem  that  there  exist  continuously 
differentiable  functions  Aj(e),  and  Xj(e)  with  A_,(0)  =  A_,,  and  Xj(0)  =  Xj  such  that 
{A  +  eE)xj(t)  =  Xj(e)xj(e).  That  is  (Aj(e),x,(e))  is  an  eigenpair  of  A  +  cE. 

Theorem  1.0.2  Let  A  be  an  nxn  matrix.  Assume  the  eigenvalues  of  A  are  all  distinct. 
Suppose  X~*AX  =  diag(  Ai,A2,...,An  )  is  a  diagonalization  of  A.  Then  given  an 
arbitrary  nxn  matrix  E,  and  for  a  sufficiently  small  t,  there  exist  continuously 
differentiable  functions  Aj(e),  and  Xj(e)  with  A;(0)  ==  A; ,  and  Xj(0)  =  Xj  such  that 
the  following  matrix  equation  holds: 


{A  +  eE)xj{t)  =  XJ(e)xJ{e). 


(1.1) 


Proof  of  Theorem  1.0.2  Let  A'  l  AX  =  diag(  Ai,  A2,. . .,  Aj,. ..,  An)  be  a  diagonaliza- 
tion of  A.  Let  E  be  an  n  x  n  matrix,  and  e  be  a  small  number.  Consider  the  matrix 
D  =  X~1(A  +  tE)X.  Then 


dij  = 


A j  +  e/jj       for  i  =  j, 


efij  for  i  ^  j, 

where  ftJ    =    yjEx3,    yj   is  the  ith  row  of  A'-1.      Let   B    =    diag(  1,1 ,e/k, 

. . . ,  1  ),  and  let  C  =  BDB~\  then 


C  = 


Ai 


+ 


e/21 


e/12 
e/22 


fc/ii 
fc/2j 


ffl      t/»3 


c/nl  e/n2 


^ 


33 


*/. 


nj 


t/ln 
e/2n 


f/nn 


Applying  Theorem  1 .1).  1  to  C,  we  find  that  the  ;th  eigenvalue  lies  in  the  circular  disk 


r2     n 


with  center  (A_,  +  efa),  and  radius  r,  =  —  ^}  |/i»|i  provided  e  is  sufficiently  small. 


and  k  is  chosen  so  as  to  make  the  jth  Gerschgorin  disk  disjoint  from  the  other  n  —  1 
disks.  We  may  choose  a  value  of  k,  which  is  independent  of  t  in  the  following  way. 
Choose  k  to  have  the  largest  value  consistent  with  the  inequalities: 

\kfa\  <  -\Xj  -  A,|,  i  ^  j,  «  =  l,...,n. 

We  achieve  the  above  inequality  if 

-  A 

,  i^j,  i  =  l,...,n. 


mm 


Xj  —  A, 


With  the  above  value  of  k,  the  jth  disk  will  be  isolated  from  the  other  n  —  1  disks,  the 
radius  of  the  jth  disk  will  be  of  order  e2,  and  the  perturbation  in  the  jth  eigenvalue 
will  be  of  order  e  as  e  — >  0. 

Next  we  discuss  the  effect  of  perturbations  in  the  elements  of  A  on  the  eigen- 
vectors in  its  diagonalization.  As  earlier,  let  X_1AX  =  diag(  Ai,  A2, . . . ,  Aj, . . . ,  An  ) 
be  a  diagonalization  of  A.  The  columns  X\,X2, . . .  ,xn  of  X  form  a  complete  set  of 
eigenvectors.  We  denote  eigenvectors  of  A  +  tE  by  £i(e),  x2(e), . . . ,  xn(e),  and  eigen- 
vectors of  D  —  A'-1  (A  +  tE)X  by  v\(t),v2(t),  ■  ■ .  ,vn(t)  such  that  x;(e)  =  Xvi(e). 
Since  (Xj(t),Vj(e))  is  an  eigenpair  of  Z),  and  Aj(0)  =  Aj,  so  we  must  have  Vj(0)  —  ej, 
which  is  the  jth  unit  vector.  Let  Vj(e)  be  normalized  so  that  its  largest  element  is  1. 
For  sufficiently  small  e,  claim  that  the  jth  element  of  Vj(e)  is  1.  If  not,  let  Vij(e)  =  1 
for  some  i,  i  7^  j.  Since  Dvj(t)  =  \j(e)vj(e),  so  equating  the  zth  component  on  both 
sides,  we  get 

n 

Xj{e)v,j{e)  =  \iVij(e)  +  e^2  fiivuie),  (1.2) 

that  is 

n 

\j{e)  -  \t  =  e^2fuvij(e). 

1=1 

Now  letting  e  — +  0,  we  find  Xj  —  A,  =  0,  i  7^  j,  a  contradiction.  So  for  sufficiently 
small  e,   Vj:(t)  =  1.    Next,  we  will  show  that  the  other  components  of  Vj(e)  are  all 


less  than  Nt  as  e  — >  0  for  some  TV  >  0.  Since  |«ij(c)|  <  1,  so  from  (1.2),  we  get 

|Aj(«)-*IK-(e)l  <«£!/«!■ 


/=i 


Let  | Aj ( e )  -  A,|  >  -|Aj  -  A,|  for  i  ^  j,  i  =  l,...,n,  then  |»y(c)|  <  Are,  where 


iV  = 


max 


2£|/«| 


Thus  Uij(e)  are  of  order  e  for  i  7^  j,  i  =  1, . . . ,  n.  Also  from  (1.2),  we  get 

n 

(Aj(e)  -  Ai)wij(e)  =  e/y  +  e^2fuvu(e). 


1=1 


The  second  term  on  the  right  is  of  order  e2,  so  for  i  ^  j,  i  =  1, . .  .,n,  we  have 


"o(c) 


=  0(O. 


A,  -  A, 

Hence  Aj(e),  and  Xj(e)  are  differentiable  functions  of  e,  and  their  first  derivatives  are 
continuous.  This  completes  the  proof  of  the  theorem.  □ 

Now  we  are  ready  to  derive  the  necessary  equations  to  find  a  diagonalization  of  a 
matrix  A,  whose  eigenvalues  are  all  distinct. 

Let  (Aj,  Xj)  be  a  simple  eigenvalue,  and  corresponding  eigenvector  of  .4.  Given  an 
n  x  n  matrix  E,  and  for  a  sufficiently  small  e,  let  Xj(e),  and  .Tj{e)  be  differentiable 
functions  of  t  such  that  A;(0)  =  \:,  and  Xj(0)  =  Xj.  Differentiating  (1.1)  with  respect 
to  e  and  putting  t  =  0,  we  have 


Ax'^+Ex^X'^xj  +  Xjx'jiO). 


(1.3) 


Since  the  columns  x\,  x2,  . . . ,  xn  of  X  form  a  complete  set  of  eigenvectors  of  A, 
we  can  expand  Xj(e)  in  the  following  way: 


xj(e)  =  5Z*y(c)*i- 


(1.4) 


i=l 


Since  a  scalar  multiple  of  an  eigenvector  is  again  an  eigenvector,  we  can  take  bjj(e)  =  1 
without  loss  of  generality. 

Now  differentiating  (1.4)  with  respect  to  e  and  setting  e  =  0,  we  have 

*S(o)  =  XX-(o)*.-.  (i-«) 


1=1 


Substituting  (1.5)  into  (1.3),  we  obtain 


A  J2  b'lJ(0)x1  +  Ex,  =  A;(0)x,  +  \j  J2  b'aWxi 

■=1  1=1 


£  b'^XiXi  +  EXj  =  fyOfa  +  £  W*i*i 

.=i  >=i 

eX]  =  a;.(o)xj  +  it  b'ij(°)  (Ai  -  A«)  *•■■  (L6) 


1=1 

•/J 


To  obtain  expressions  for  Aj-(0),  and  bjj(0)  we  use  relations 

T  I    0    for  i  /  &, 

y'  Xfc  =  \  1     for  i  =  k, 

where  yf  is  the  ith  row  of  X-1.  So  premultiplying  (1.6)  by  yf,  and  yj,  yields 


(1.7) 


AJ(0>.=  yjfl*;.  (1.8) 


vT Ex 
yf^- =  6^(0)^- -  Afc)      =►      b'kj(0)  =  j±-fk 


With  these  values  of  b'kj(0),  (1.5)  becomes 

■=i  ^j  ~~  Ai 


n    vTEx  ■ 

1=1 


A  first  order  Taylor  expansion  gives 


\3{e)    «    A^O)  +  eA}(0), 
x,(e)    »    x,-(0)  +  ea^O). 


Substituting  the  values  of  Xj{0),  and  a^(0)  from  (1.8),  and  (1.9)  in  the  above  approx- 
imations, we  get 

\j(e)    «    XjW  +  qfiExj, 

xj(e)    «    Xj(0)  +  e±£^fXi.  (1.10) 

To  develop  an  algorithm  we  use  (1.10)  in  the  following  way.  Suppose  XAX~*  is 
an  approximate  diagonalization  of  B.  If  we  identify  E  in  (1.10)  with  B  —  A'AA'-1. 
then  to  the  first  order: 

Xj(e)    «     \J(0)  +  eyJ(B-X\X-1)x, 

«    A^OJ  +  qfjBxi-cA^O), 

*i(«)    ~    ij(0)  +  eE— j : — *, 

.=i  A;  —  **« 

•Vi 

*  »i(o)  +  eEr — r*«- 

With  e  =  1,  these  approximations  for  an  eigenvalue,  and  corresponding  eigenvec- 
tor give: 


Xnerv   =    [yoldj      Bj.old  for     ■  _  j  .  ^ 


„o/<f 


"   fv^l    fix0' 
x"-  =  xf  +  y;\      ^      x    J    ^    forj==l:n,  (1-11) 

■  =  i     ^j         —  *i 
>*} 

yrnew  _  /  Yneu,1-1 

We  can  use  the  above  expressions  to  formulate  an  algorithm  to  compute  a  diago- 
nalization of  a  nondefective  matrix  A  as  follows: 

Algorithm  1. 0.1  (Diagonalization)  Given  a  nondefective  matrix  .4  €  C"x",  a  matrix 
of  approximate  eigenvectors  A'0,  and  a  tolerance  tol  greater  llian  the  unit  roundoff, 
the  following  algorithm  computes  a  diagonalization  A  =  AAA'-1. 

Dif=l;   X-X0-  [xuX2y...,xn]\    Y  =  X-1  =  [yuy2 yn  }T 

until  Dif  <  tol 


C  =  0;  {  C  =  [  ci,  c2, . . . ,  cn  ]  is  a  n  x  n  zero  matrix.     } 
for  j  =  1  :  n 

end 

for  j  =  1  :  n 
for  i  =  1  :  n 

ifi^j 

,  (yt)rAxj 

Cj  *~  Cj  +    A,  -  A,  X' 
end 

end 

^i  ^~  xj  +  cj>  ^j  *~~  ^j/ IfjII 

end 

r «-  x-1 

Dif  <—  ||C||f  ;  {  where  ||  ■  \\p  is  the  Frobenious  matrix  norm.    } 
end 
Above,  we  use  a  normalization  for  Xj  to  reduce  the  growth  of  X. 


Example  1.0.1  If  Algorithm  1.0.1  is  applied  to 


A 


8.0 

-1.4 

0.8 

6.5 

1.8 

2.3 

7.1 

1.3 

1.7 

5.0 

-3.1 

-4.7 

-0.1 

-2.4 

-1.3 

-4.5 

3.8 

-2.3 

-7.7 

0.2 

1.2 

0.7 

-0.5 

-0.2 

-1.4 

with  A'0  =  /,  and  tol  =  10"9,  then  the  eigenvalues  converge  as  shown  in  the  following 
table: 


Iteration 

A, 

A2 

A3 

A4 

A5 

1 

8.0000 

7.1000 

-0.1000 

-7.7000 

-1.4000 

2 

7.6170 

5.0012 

0.9699 

-6.7182 

-0.9700 

3 

7.1096 

5.47SS 

1.5900 

-6.1843 

-2.0941 

4 

8.7039 

4.5657 

0.9711 

-6.0978 

-2.2429 

5 

7.9836 

5.0217 

0.8616 

-6.0530 

-1.9139 

6 

7.9953 

4.9963 

0.8889 

-6.0571 

-1.9235 

7 

7.9943 

4.9964 

0.8892 

-6.0571 

-1.9229 

8 

7.9943 

4.9964 

0.8892 

-6.0571 

-1.9229 

The  exponents  of  convergence  p  in 


l-Xfc+i  —  x 


'MP 


for  different  values  of  k  are  given 


in  the  following  table,  where  Xk  is  the  matrix  of  eigenvectors  at  the  A-th  iteration, 
and  X  is  the  final  matrix  of  eigenvectors: 


k 

P 

1 

3.207 

2 

0.430 

3 

4.123 

4 

2.086 

5 

2.104 

6 

1.962 

Hence  the  method  appears  to  converge  quadratically.  (  N.  Ghosh  gave  a  theoretical 
proof  of  the  quadratic  convergence  of  the  diagonalization  method  ). 

I:\rani pic  1.0.2  If  Algorithm  1.0.1  is  applied  to 


55  7  61  51  79 

47  52  21  61  58 

74  33  4  79  60 

68  56  59  29  52 

42  73  26  20  6 


with  A'0  =  /.  and  lol  =  10   8.  then  the  approximate  eigenvalues  do  not  converge  to 
the  eigenvalues  of  A,  which  is  clear  from  the  following  table: 


Iteration 

A, 

A2 

A3 

A4 

A5 

1 

55 

52 

4 

29 

6 

100 

-1241.77 

-4853.27 

1586.28 

5100.90 

-446.15 

200 

124807.75 

-318959.32 

-368210.46 

369633.58 

192874.49 

300 

416577.75 

230821 

-768266.74 

-1252454 

1373474 

400 

539592 

-163324.75 

474518.5 

-868096 

17506.02 

500 

685138 

422251 

17628 

-235428.5 

-889446 

600 

-13438605.19 

-1459224.5 

7491094 

14400555 

-6993606 

700 

-72503.38 

-513144 

-3064817 

1601468 

2049060 

800 

70919.84 

586343.25 

787740.75 

-886619.25 

-558243.75 

900 

436916.5 

1537.75 

635731.38 

-511936 

-562132 

1000 

-2682893.03 

42945.63 

-901415.25 

-1409781.22 

4951288.04 

1100 

-1244105 

440503.5 

-11654959.5 

15728878 

-3270228.12 

The  eigenvalues  of  A  are  A  =  [233.3505,  27.1460,  -8.3339,  -43.2339,  -62.9287  ]. 

The  above  example  shows  that  in  general,  the  identity  matrix  can  not  be  used 
for  the  starting  guess  matrix  of  eigenvectors  for  a  nondefective  matrix  with  distinct 
eigenvalues. 

If  by  another  method,  a  matrix  of  eigenvectors  of  a  matrix  A  can  be  determined, 
then  this  method  can  be  used  to  find  the  eigenvalues,  and  corresponding  eigenvectors 
of  a  slightly  perturbed  matrix  A  +  eE,  where  E  is  an  arbitrary  matrix,  and  e  is  a 
small  scalar.  For  example,  first  reduce  the  matrix  A  to  an  upper  Hessenberg  form 
H  by  an  orthogonal  matrix  Q,  and  then  apply  the  QR  method  with  double  implicit 
shift  to  H  to  obtain  a  quasi-upper  triangular  matrix.  Next  solve  (H  —  ///)  v  =  0  to 
find  corresponding  eigenvector  v  of  the  eigenvalue  //.  Then  x  =  Qv  is  the  eigenvector 
of  A  corresponding  to  the  eigenvalue  \i.  Now  these  eigenvectors  can  be  used  as 
approximate  eigenvectors  in  the  diagonalization  method  to  find  the  eigenvalues,  and 
corresponding  eigenvectors  of  the  perturbed  matrix  A  -f  tE. 


Example  1.0.3  We  reduce  the  matrix  A  in  Example  1.0.2,  first  to  an  upper  Hessenberg 
form,  and  then  apply  the  QR  method  with  double  implicit  shift  with  tol  =  10-6.  After 
6  iterations,  we  obtain  a  real  Schur  decomposition  S1 AS  =  U,  and  the  eigenvalues  of 
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f/areA  =  [233.3505,  27.1460,  -8.3339,  -43.2339,  -62.9287].  To  find  an  eigenvector 
x  corresponding  to  the  eigenvalue  //,,  let  B  —  A  —  /*,/.  Then  solve  Bx  =  0  for  x. 
Here  is  the  matrix  of  eigenvectors: 


0.4674 

0.5909 

0.3002 

-0.7147 

0.2386 

0.4487 

-0.6005 

0.0657 

-0.1294 

0.3043 

0.4730 

0.2783 

-0.4712 

0.3346 

0.5919 

0.4986 

0.1176 

-0.6361 

0.0707 

-0.4198 

0.3284 

-0.4460 

0.5280 

0.5962 

-0.5691 

X  = 


Next  consider  the  following  matrices  A\,  and  A2,  which  we  get  by  perturbing  the 
elements  of  A: 


55.056 

7.065 

61.012 

51.021 

79.023  " 

47.032 

52.034 

21.045 

61.067 

58.07G 

74.078 

33.087 

4.013 

79.031 

60.021 

68.042 

56.035 

59.053 

29.046 

52.057 

42.075 

73.068 

26.086 

20.079 

6.097 

Aj 


55.56  7.65  61.12  51.21  79.23 

47.32  52.34  21.45  61.67  58.76 

74.78  33.87  4.13  79.31  60.24 

68.42  56.35  59.53  29.46  52.57 

42.75  73.68  26.86  20.79  6.97 

If  we  use  Algorithm  1.0.1  to  Ai,  and  A2  with  A"0  =  X,  and  tol  =  10-6,  then  af- 
ter 3  iterations  we  obtain  diagonalizations  Y~x A\Y  =  D\,  and  Z~x A-iZ  =  D2, 
whereas  if  we  apply  the  QR  method  with  double  implicit  shift  to  STA\S,  and  STA2S, 
then  after  4  iterations  we  obtain  real  Schur  decompositions  PT(ST A\S)P  =  ft,  and 
QT(STAlS)Q  =  T. 

The  derivation  of  the  results  in  (1.8).  and  (1.9)  can  be  found  in  many  Numer- 
ical Linear  Algebra  texts.  The  derivation  of  the  formulas,  which  are  used  in  Al- 
gorithm 1.11  can  be  found  in  [HagSS].  The  result  in  (1.9)  holds  only  when  the 
eigenvalues  of  A  arc  all  distinct.  So  a  natural  question  is  to  ask  whether  we  can 
derive  equivalent  results  for  matrices  with  multiple  eigenvalues. 
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In  Section  2.1,  we  will  show  how  Algorithm  1.11  can  be  generalized  to  diagonalize 
a  nondefective  matrix  with  multiple  eigenvalues.  In  Section  2.4,  we  will  show  that 
the  generalized  algorithm  converges  locally  quadratically.  A  Differential  Equation 
Approach  to  Eigencomputations  is  introduced  in  Section  2.2  to  create  an  iterative 
method  to  find  the  eigenvalues,  and  corresponding  eigenvectors  of  a  nondefective 
matrix  A.  The  differential  equations  governing  the  block  matrices  have  the  form: 

A(t)  =  -A(t)  +  Diag  (Xitr'AXit))  ,    and  X(t)  =  X(t)F  (X(t),  A(t)) , 

where  Diag(M)  is  a  block  diagonal  matrix  formed  from  the  diagonal  blocks  of  the 
block  matrix  M,  Fj3(X(t),A(t))  is  a  square  zero  matrix,  and  Fij(X(t),A(t))  for 
i  zfi  j  is  the  solution  B  to  the  matrix  equation  BAj(t)  —  At(t)B  =  Yi(t)  AXj(t). 
Here  Yt(t)T  denotes  the  z'th  block  row  of  X(t)~x.  The  Euler  approximation  to  the 
differential  equations  can  be  expressed  as: 

An+1(0  =  An  +  At  (Diag  (x-'AXn)  -  A„)  ,  and  Xn+1(t)  =  Xn  +  AtXnF  (A'„,  A„) . 

(1.12) 
In  Section  2.3,  we  modify  (1.12)  by  replacing  the  fixed  stepsize  At  in  each  iteration 
by  a  variable  stepsize  s.  Armijo's  rule  from  optimization  theory  is  used  to  find  a 
stepsize  s,  for  which  \\A  -  Xn+i(s)An+1{s)X~ll(s)\\F  <  \\A  -  XnAnX~l\\F  holds. 
Example  2.3.2  shows  that,  when  a  good  approximation  to  the  matrix  of  eigenvectors 
is  not  known,  the  algorithm  in  Section  2.3  may  not  converge.  In  Section  2.5,  we 
discuss  the  factors  sensitivity  in  the  Schur  decomposition  relative  to  perturbations 
in  the  coefficient  matrix  to  compute  a  block  Schur  decomposition  of  a  matrix.  In 
Section  2.6,  we  present  an  algorithm  to  compute  a  block  Schur  decomposition  of  a 
matrix.  This  algorithm  is  based  on  sensitivity  results  for  the  factors  in  the  Schur 
decomposition  relative  to  perturbations  in  the  coefficient  matrix,  and  Armijo's  rule 
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from  optimization  theory.  The  starting  eigenvalues  associated  with  the  Schur  decom- 
position are  uniformly  distributed  points  on  a  circle  in  the  complex  plane,  where  the 
circle  includes  all  Gerschgorin  disks  for  the  matrix,  and  the  starting  guess  unitary 
factor  is  the  identity  matrix. 

In  Chapter  3,  we  discuss  how  to  exploit  the  structure  of  a  matrix.  Since  symmetric 
matrices  are  diagonalizable,  we  modify  the  diagonalization  method,  and  the  block 
diagonalization  method  to  exploit  the  matrix  symmetry  and  reduce  the  given  matrix 
into  a  diagonal  matrix  with  eigenvalues  along  the  diagonal. 


CHAPTER  2 
EIGENVALUES  OF  UNSYMMETRIC  MATRICES 

2.1     Block  Diagonalization  of  a  Matrix 

Let  A  be  an  n  x  n  nondefective  matrix.  Our  aim  is  to  find  an  iterative  method  to 
compute  a  block  diagonalization  of  the  form  A  —  XAX-1,  where  A  =  diag(  Ai,  A2, . . ., 
A<)  is  a  block  diagonal  matrix,  and  X  =  [Xi,  X2, . . .  ,Xt]  is  a  compatible  block  column 
matrix  such  that  AXj  =  XjAj.  Here  A3  =  diag(  Aj,  \j, . . . ,  Xj  )  is  an  mj  x  rrij  scalar 
matrix.  It  can  always  be  arranged  so  that  the  eigenvalues  of  A;  and  Aj  for  i  ^  j 
are  distinct.  We  will  use  a  sensitivity  result  for  eigenvalues  and  eigenvectors  to  block 
diagonalize  a  matrix.  In  the  derivation  of  necessary  equations  to  block  diagonalize  a 
matrix,  we  need  the  continuity  of  eigenvalues,  and  the  perturbation  theory  based  on 
Gerschgorin's  theorem. 

Recall  that  the  coefficients  of  the  characteristic  polynomial  of  a  matrix  B  are  con- 
tinuous functions  of  the  elements  of  B  [Wil65,  page  66],  and  the  zeros  of  a  polynomial 
depend  continuously  on  its  coefficients.  Proof  is  in  [Hen74,  page  281].  So  the  eigen- 
values of  B  depend  continuously  on  the  elements  of  B.  Also,  Rouche's  theorem  states 
that  if  /  and  g  are  analytic  in  a  neighborhood  of  a  closed  disk  D  =  {z  :\z  —  a\  <  R) 
centered  at  a,  /  has  no  zeros  on  DD  =  {z  :  \z  -  a\  =  R},  and  \g(z)\  <  \f{z)\  holds 
for  z  G  dD,  then  f(z),  and  f(z)  +  g{z)  have  the  same  number  of  zeros  in  D.  Detail 
is  in  [Hen74,  page  280]. 

Theorem  2.1.1  Let  A  be  an  eigenvalue  of  A  of  algebraic  multiplicity  k.   Then  for  any 
norm  \\  •  ||,  and  all  sufficiently  small  n  >  0,  there  is  a  8  >  0  such  that  if  \\E\\  <  8, 
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then  the  disk  D  =  {z  €  C  :  \z  —  A|  <  77}  centered  at  A  contains  precisely  k  eigenvalues 
ofA  +  E. 

Proof  of  Theorem  2.1.1  Proof  is  given  in  [Ste90,  page  167]. 

Next  we  will  discuss  the  effect  of  perturbations  in  the  elements  of  A  on  the  eigen- 
values, and  the  eigenvectors  in  its  diagonalization,  which  is  based  on  Gerschgorin's 
theorem.  Let  A  be  an  n  x  n  nondefective  matrix.  Suppose  X~lAX  =  A  is  a  block 
diagonalization  of  A,  where  A  is  a  block  diagonal  matrix.  If  A_,  denotes  the  jth  diag- 
onal block  of  A,  then  we  have  AXj  =  XjA.j,  where  Xj  is  the  jth  block  column  of  X. 
Given  an  arbitrary  matrix  E,  and  for  a  sufficiently  small  e,  we  will  show  in  the  next 
theorem  that  there  exist  continuously  differentiable  functions  Aj(e),  and  Xj(e)  with 
A,(0)  =  Aj,  and  X,(0)  =  Xj  such  that  {A  +  eE)XJ(t)  =  Xj(e)Aj(e). 

Theorem  2.1.2  Let  A  be  an  n  x  n  nondefective  matrix.  Suppose  X~lAX  =  A  is  a 
block  diagonalization  of  A,  and  if  Aj  is  the  jth  diagonal  block  of  A,  then  AXj  =  XjAj, 
where  Xj  is  the  jth  block  column  of  X .  Suppose  A,,  and  Aj  for  i  ^  j  have  distinct 
eigenvalues.  Then  given  an  arbitrary  n  x  n  matrix  E,  and  for  a  sufficiently  small 
(.  there  exist  continuously  differentiable  functions  Aj(e),  and  Xj(e)  with  Aj(0)  =  Aj, 
and  A'j(O)  =  Xj  such  that  the  following  matrix  equation  holds: 

(A  +  eE)XJ(e)  =  XJ(t)AJ(c).  (2.1) 

Proof  of  Theorem  2.1.2  Let  A'-1  AX  —  diag(  Ai,  A2, .  .  . ,  Aj  )  be  a  block  diagonaliza- 
tion of  A,  where  Aj  =  diag(  A,-,...,Aj  )  is  a  scalar  matrix,  and  let  m:  =  dim(Aj). 
Sui>|>ose  mj  >  1.  Let  E  be  an  n  x  n  matrix,  and  e  be  a  small  number.  Consider  the 
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matrix  D  =  X~l(A  +  tE)X.  The  form  of  this  matrix  is  illustrated  below  for  t  —  5: 

Ai 

A2 

+  e 


D 


A, 


A, 


Fn  Fu  F13  F14  F15 

F2\  F22  F22,  F24  F25 

F31  F32  F33  F34  F35 

F41  F42  F43  F44  F45 

F51  F52  F53  F54  F55 


here  Fi}  =  YfEXj,  Y?  is  the  ith  block  row  of  A'"1.  Let  B  =  diag(  Bu  B2, . . . ,  Bt ), 

,  £,  and  dim(5j)  =  mj,   j  = 


where  Bi   =  diag     -,...,-  I,   2?,-  =  J  for  t  =  2, 
\P  P/ 

l,...,i.  Let  C  =  BDB~\  then 


C  = 


Aj 


A, 


A, 


A, 


+ 


eFn 
pF2\ 

P 

eF22 

^,3 

P 

tF23 

P 
cF24 

^F15 

P 

eF25 

pF3i 

eF32 

tF33 

eF34 

eF35 

pF4\ 

eF42 

tFA3 

eF44 

eF45 

pF51 

tF52 

eF53 

eF54 

eF55 

Next,   let   U~lFnU    =    £   be  a  diagonalization  of  Fn,    where  S    =    diag(  a\,o2, 
. . . ,  crm,  )  with  some  of  a^s  may  be  identical.  Let  W  —  diag(  [/,  i?2,  ■  ■  ■  ,Bt).  Then 

Ax 


W_1CVK 


A, 


A, 


A4 


A, 


+ 


■U~1F12    -U~XFXZ    -U-XFX4    -U~1F15 


cF22 
eF32 
eF42 
tF52 


P 


PF21U 

pF31U 

PF41U 

L  pFsiU 

We  may  choose  a  value  of  p,  which  is  independent  of  e  such  that  the  first  n?x  disks 


eF23 
eF33 
eF43 
eF53 


eF24 
eF34 
eF44 
eF54 


eF25 
eF35 
eF45 
eF55 


are  disjoint  from  the  other  n  —  mi  disks.  So  for  a  proper  value  of  p,  ??ix  eigenvalues  of 
W~lCW  lie  in  the  union  of  the  mi  disks,  which  are  centered  at  Ax+eoj,  j  =  l,...,mi. 
The  radius  of  each  disk  is  of  order  e2,  and  the  perturbation  in  each  eigenvalue  Ax  is 
of  order  e  as  e  — >  0. 
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Next  we  discuss  the  effect  of  perturbations  in  the  elements  of  A  on  the  eigenvectors 
in  its  block  diagonalization.  As  earlier,  let  X~XAX  =  diag(  Al7  A2, . . . ,  A, )  be  a  block 
diagonalization  of  A,  where  A;  =  diag(  A;,  ...,Aj  )  is  a  scalar  matrix,  and  nij  = 
dim(Aj).  Clearly  the  columns  X!,x2, . . .  ,xn  of  X  form  a  complete  set  of  eigenvectors. 
We  denote  eigenvectors  of  A  +  tE  by  T1(e),x2(c), . . . ,  xn(c),  and  eigenvectors  of  D  = 
X~X(A  +  eE)X  by  ui(e),u2(e),. .  .,u„(e)  such  that  x,(e)  =  Xu,(e). 

To  illustrate,  we  consider  a  matrix  D  with  /  =  n— 2  such  that  Aj  =  diag(  Ai,  \x,  Ai ), 
Aj  =  AJ+2,  j  =  2,  ...,n  -2: 
A, 


£> 


A, 


A, 


A, 


A„ 


+  e 


/ll  /l2  /13  /l4  "  "  '  /in 

/21  /22  /23  /24  •  '  •  /2n 

./31  /:i2  /33  /34  -  '  -  /3n 

/ll  /*2  /43  ^44  •••  fin 

Jn\  Jn2  Jn3  J n4  '  Jnn 


where  ftJ  =  yj E.Vj,  and  yj  is  the  ith  row  of  X~x.  First  consider  the  simple  eigenvalue 
A4.  Since  (A4(e),  u4(e))  is  an  eigenpair  of  D,  and  A4(0)  =  A4,  so  we  must  have 
u4(0)  =  e4,  which  is  the  fourth  unit  vector.  Let  u4(e)  be  normalized  so  that  its 
largest  element  is  1.  For  sufficiently  small  e,  claim  that  the  fourth  element  of  u4(t)  is 
1.  If  not,  let  u,4(c)  =  1  for  some  i,  i  ^  4.  Since  DuA(e)  =  A4(e)u4(e),  so  equating  the 
ith  component  on  both  sides,  we  get 


that  is 


A4(e)tt,-4(e)  =  A,»l4(e)  +  ej^  fytij^e), 


A4(t)  -  A,  =  cJ2fvllji(<)- 


(2.2) 


Now  letting  e  — >  0,  we  find  A4  —  A,  =0,  i  ^  4,  a  contradiction.    So  for  sufficiently 
small  e,  u44(e)  =  1.  Next  we  will  show  that  the  remaining  components  of  114(e)  arc 
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all  less  than  Me  as  e  — *  0  for  some  M  >  0.  Since  |u,4(e)|  <  1,  so  from  (2.2),  we  get 

|A4(e)-A!||u!4(0l<^El^l- 
Let  |A4(e)  —  A,|  >  -|A4  —  A,-|  for  i '  ^  4,  i  =  1,...  ,  rc,  then  |u;4(e)|  <  Me,  where 


M 


max 
i  ±\ 


(  n 

2EIA»I 

■7=1 

|A4  —  Aj- 


Thus  u,4(e)  are  of  order  e  for  i '  ^  4,  z  =  1, . . .  ,ra.  Again  from  (2.2),  we  obtain 

n 

(A4(e)  -  Ai)it,-4(e)  =  eft4  +  e  ^  fijUj4(e). 


J*4 


The  second  term  on  the  right  is  of  order  e  ,  so  for  1^4,  z  =  1, . . .  ,n,  we  have 


Uj4(e)  - 


Aa    —    Ai 


0(6"). 


Now  we  turn  to  eigenvectors  corresponding  to  the  multiple  eigenvalue  Aj.  Suppose 
u\(e)  is  a  normalized  eigenvector  of  D  corresponding  to  the  eigenvalue  \i{e).  Using 
a  similar  argument  as  in  the  case  of  the  simple  eigenvalue  A4(e),  we  can  show  that 
the  largest  element  of  Ui(e)  can  not  be  u,i(e),  i  =  4, . . .  ,  n.  In  fact,  these  elements 
are  of  order  e.  Therefore  the  normalized  ui(e)  must  have  one  of  the  following  forms: 


1 
Pi(e) 
P2(e) 
u4l(e) 

«ni(e) 


,   or 


pi(<0 

1 

u41(e) 


,    or 


Pi(«) 
P2(e) 

1 
u4l(e) 

.  Wnl(0   . 


.  wnl(e) 

where  |p,-(e)|  <  1,  z  =  1,2.  Using  the  same  technique,  as  in  the  case  of  ui(e),  we 
can  show  that  ^2(^)1  an<^  u3(e)  a^so  must  have  one  of  the  above  forms.  Since  the 
dimension  of  the  diagonal  block  Ai  is  three,  and  Ai  is  a  scalar  matrix,  so  we  have 
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three  degrees  of  freedom  to  normalize  i/i(e),  two  degrees  of  freedom  to  normalize 
u2{t),  and  one  degree  of  freedom  to  normalize  t*3(e).  Hence  we  take 


u,(c)  = 


1 

0 

0 

u41(e) 


,    "2(e)  = 


0 

1 

0 

"42(e) 


,    and  u3(e)  = 


0 

0 

1 

"43(e) 


"m(e)  J  L  ""2(e)  J  L  "n3(e) 

As  in  the  case  of  the  simple  eigenvalue  A4(e),  we  can  show  that 


Me) 


=  0(O,  i  =  4,...,n,;  =  1,2,3, 


A,  -  A, 

where  Aj  =  A2  =  A3.  Hence  Aj(e),  and  Xj{e)  are  continuously  differentiable  functions 
of  t.  This  completes  the  proof  of  the  theorem.    □ 

Now  we  are  ready  to  derive  the  necessary  equations  to  obtain  a  block  diagonal- 
ization  A  —  AAA'"1  of  a  nondefective  matrix  A.  Let  Aj  be  the  jth  block  diagonal 
element  of  A,  and  Xj  be  the  corresponding  block  column  vector  of  X.  Given  an  n  x  n 
matrix  E,  and  for  a  sufficiently  small  t  £  R,  let  A_,(e),  and  A'j(e)  be  differentiable 
functions  of  e  such  that  A_,(0)  =  Aj,  and  Xj(0)  =  Xj.  Differentiating  (2.1)  with 
respect  to  t  and  then  putting  t  =  0,  we  obtain 


AX^O)  +  EX;  =  X'^Aj  +  XjA'jiO). 


(2.3) 


Since  the  columns  of  X  are  linearly  independent,  so  we  can  express  Xj(t)  in  the 
following  way: 

AJ(0  =  LA'./^(')-  (2-4) 

1=1 

where  Bij(t)  is  an  m,  x  irij  matrix.  We  normalize  (2.4)  by  taking  B}J{<-)  =  /•  Differ- 
entiating (2.4)  with  respect  to  <  ;uid  setting  1  =  0,  we  have 


a;(o)  =  ;ta,/^(o). 


(2.5) 


1=1 
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Substituting  (2.5)  into  (2.3),  we  obtain 


A  £  XiB'^0)  +  EXj  =  £  XiBtjMAj  +  A>A;(0) 

i=i  1=1 

£A.-AtfJ,(0)  +  EX,  =J2XiB'ij(0)Aj  +  X^O) 

1=1  1=1 

EX3  =J2*i  K(0)A,  -  Mfc(0))  +  XjAbiO).  (2.6) 


1=1 


Since 


*?* = { ?  t  a  i,  <"» 


where  V^T  is  the  zth  block  row  of  X  1.  Premultiplying  (2.6)  by  YT ,  and  Y?  respec- 
tively, yields 

A}(0)  =  YjEXh  (2.8) 

5^(0)^^4(0)  =  ^^.  (2.9) 

Equation  (2.9)  is  a  linear  equation  in  the  unknown  B\ ,(0).  Here  we  mention  two  well- 
known  methods  to  find  B[-{Q).  To  derive  these  two  methods,  we  need  the  following 
two  theorems. 

Theorem  2.1.3  [Hag88,  page  317]  If  B  e  Cpxp ,  and  C  €  Crxr  are  nondefective 
matrices,  then  the  solution  Z  to  the  matrix  equation 

ZB-CZ  =  R  (2.10) 

is  given  by  Z  =  VWU~X ,  where  B  =  UTiU-1 ,  and  C  =  VQV'1  are  diagonalizations 

{V-iRU)^ 

of  B,  and  C  respectively,  and  W  =  (wij)  is  a  r  x  p  matrix  with  iw,j  = -. 

a j  —  u!t 

Proof  of  Theorem  2.1.3  Let  B  =  UEU~\  and  C  =  VflV'1  be  diagonalizations  of  B, 
and  C  respectively.  Then  from  (2.10),  we  get 

zuw-1  -  vnv^z  =  r 
v-*zuY,-nv-xzu   =   V~lRU. 
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Let  W  =  V~lZU,  and  D  =  V^RU,  then  the  last  equation  reduces  to  WL  -  fill'  = 

D.  Comparing  the  (?',  j)  entry  on  both  sides  of  the  preceding  relation,  we  have 

(V-'RU),- 
Wij  (<Tj  -  Ufi)  =  dtJ,      or      Wij  =  ,       i  =  l,2 r,      j  =  1,2 p. 

a  j  -  u,', 

and  with  this  W,    Z=  VWU-1.      □. 

Theorem  2.1.4  If  A  G  Rpxp,  and  B  €  Rrxr,  then  the  solution  Z  to  the  matrix  equation 

AZ-ZB  =  C  (2.11) 

is  given  by  Z  =  UPV  ,  where  A  =  URUT ,  and  B  =  VSVT  are  real  Schur  decompo- 
sitions of  A,  and  B  respectively,  and  P  is  the  solution  to  the  Sylvester  equation 

RP  -  PS  =  UTCV. 

Proof  of  Theorem  2.1.4  Let  A  =  U RUT ,  and  B  =  VSVT  be  real  Schur  decomposi- 
tions of  A,  and  B  respectively.  Then  (2.11)  becomes 

URUTZ  -  ZVSVT    =    C 
RUTZV  -UTZVS    =    UTCV. 

Let  P  =  UTZV,  and  D  =  UTCV.  Then  the  last  equation  reduces  to  RP  -  PS  =  D, 
which  is  a  Sylvester  equation.  For  the  detailed  solution  of  a  Sylvester  equation  readers 
are  referred  to  [Gol90,  page  387].  □ 

Next  we  will  show  how  to  use  the  above  two  theorems  to  derive  two  methods  to 
find  the  unknown  B^-(O)  in  (2.9). 

Method  1  (  Theorem  2.1.3  ):  Let  P  =  B-,(0),  and  R  =  Y?EXj,  then  (2.9)  reduces 
to 

P\J-\,P  =  R.  (2.12) 

Solving  (2.12)  for  /\  we  get  P  =  VWU~X,  where  Aj  =  VW'K  and  A,  =  I'fil'-1  are 

{V-xBU)u 


diagonalizations  of  A^,  and  A,  respectively,  and  \Y  =  (u'ki)  with  tow 


0~I  —  u.'k 
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Method  2  (  Theorem  2.1.4  ):  Let  P  =  5^(0),  and  C  =  YtTEX3,  then  (2.9) 
becomes 

PA3-AlP  =  C.  (2.13) 

Solving  (2.13)  for  P,  we  obtain  P  =  VZUT,  where  Aj  =  URUT,  and  A,  =  VSVT 
are  real  Schur  decompositions  of  Aj,  and  A,  respectively,  and  Z  is  the  solution  to  the 
Sylvester  equation  SZ  -  ZR  =  W  with  W  =  -VTCU. 

Combining  the  analysis  of  the  effect  of  perturbations  in  the  elements  of  a  non- 
defective  matrix  A  on  the  eigenvalues,  and  corresponding  eigenvectors  in  its  diago- 
nalization,  and  the  results  from  (2.5),  (2.8),  and  (2.9)  we  get  the  following  Taylor 
Expansions  of  Aj(e),  and  Xj(t): 

Theorem  2.1.5  Let  A  be  a  nondefective  matrix,  A3  =  diag(\31 .  .  . ,  \3)  be  a  scalar 
matrix,  where  \3  is  an  eigenvalue  of  A  of  multiplicity  dim(Aj);  let  X}  be  the  matrix 
of  linearly  independent  eigenvectors  of  A  corresponding  to  the  eigenvalue  \3  such  that 
AXj  =  XjAj.  Given  an  arbitrary  matrix  E,  and  for  a  sufficiently  small  t,  let  Aj(t), 
and  Xj(e)  be  continuously  difjerentiable  functions  of  e  such  that  (A  +  eE)X3(e)  = 
X3{e)A3{t),  Aj(0)  =  Ajt  and  X3{0)  =  X3.   Then 

A3(e)  =  A3  +  eYJTEX3  +  0(e2), 

Xj(e)  =  Xj  +  eY,XiPij  +  0(e2), 

>*} 
where  P  =  Pi3  is  the  solution  to  the  matrix  equation  PA3  -  A,P  =  Y?EX3. 

We  use  Theorem  2.1.5  to  develop  an  algorithm  in  the  following  way.  Suppose 
XAX~*  is  an  approximate  block  diagonalization  of  a  nondefective  matrix  B.  If  we 
identify  E  in  Theorem  2.1.5  with  B  -  XAX~\  then  to  the  first  order: 

A3(e)    m    A,-  +  tYf  (B  -  XAX~X)  Xj 
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w    \j  +  eY]TBX}  -  eAj, 

Xj(e)    m    Aj  +  e^A.P,, 
1=1 

where  P  =  PXJ  is  the  solution  to  PAj  -  \,P  =  YfBXj. 

With  e  =  1,  these  approximations  lead  to  the  following  block  version  of  Algo- 
rithm 1.11: 

Anew  _  ^Yf^fBXf*  for  i  =  1  :i, 

Y-u,  =    YoW  +  £  jqWj*.       forJ-  =  1:t]  (2   H) 

1=  1 

v/ricit/   /  V'neu' \~*  1 

where  P  =  Ptj  is  the  solution  to  PAf™  -  AJ^P  =  (Y,M)T  BXfd. 

With  a  matrix  of  approximate  eigenvectors  A'o,  we  can  use  the  above  expressions 
to  compute  the  block  diagonalization  of  a  nondefective  matrix  A  in  the  following  way. 

Algorithm  2.1.1  (Block  Diagonalization)  Given  an  n  x  n  nondefective  matrix  A,  a 
matrix  of  approximate  eigenvectors  Ao,  and  a  tolerance  tol  greater  than  the  unit 
roundoff,  the  following  algorithm  computes  a  block  diagonalization  .4  =  AAA'-1. 

Dif  =  1;  X  =  X0=[X1,    A2,    ...,    Xk];Y  =  X-1  =  [Y1,Y2 Yk  ]T 

until  Dif  <  tol 

C  =  0;    {  C  =      C\,    C2,     •  •  ■ ,    Ck      is  a  ?!  x  n  zero  matrix.  } 
for  j  =  1  :  k 

A,  =  (3   \TAXj 
end 

for  j  =  1  :  k 
for  i  =  1  :  k 

Solve  ZA,  -  A,Z  =  IfAYj  for  Z. 
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end 
end 

Xj  =  Xj  +  Cj 


xi=xi/\\xj\\   for/  =  l:mi. 


{  Xj  =  [  x{,  x{, . . .  ,x*m      is  column  partitioning.  } 
end 

Y  =  X~1;    Dif=||C||F 
end 

Above,  we  normalize  Xj  to  control  the  growth  of  X. 

The  main  problem  in  this  method  is  how  to  choose  a  matrix  of  approximate 
eigenvectors  for  the  starting  guess.  If  by  another  method,  a  matrix  of  eigenvectors  of 
a  matrix  A  can  be  determined,  then  this  method  can  be  applied  with  X  as  the  starting 
matrix  of  eigenvectors  to  find  the  eigenvalues,  and  corresponding  eigenvectors  of  a 
slightly  perturbed  matrix  A  +  cE,  where  E  is  an  arbitrary  matrix,  and  e  is  a  small 
number.  For  example,  first  reduce  the  matrix  A  to  an  upper  Hessenberg  form  H  by 
an  orthogonal  matrix  Q,  and  then  apply  the  QR  method  with  double  implicit  shift 
to  H  to  obtain  a  real  Schur  decomposition  STHS  =  U.  To  obtain  an  eigenvector  y 
corresponding  to  the  eigenvalue  m,  let  B  =  H  -  ml.  Next  solve  By  =  0  for  y  in  the 
following  way.  Put  y(i)  =  1,  and  then  solve  [B  (:,  1  :  i-l)  B(:,i+1  :  n)]z  =  -B(:,i)by 
using  the  QR  factorization  technique  to  solve  an  overdetermined  system  of  equations. 
Put  j/(l  :  i  -  1)  =  2(1  :  i  -  1),  and  y(i  +  1  :  n)  =  z(i  :  n  -  1).  Then  x  =  Qy  is 
the  eigenvector  of  A  corresponding  to  the  eigenvalue  \.ix.  Now  these  eigenvectors  can 
be  used  in  the  diagonalization  method  to  find  the  eigenvalues,  and  corresponding 
eigenvectors  of  the  perturbed  matrix  A  +  eE. 
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Example  2.1.1  Consider  the  matrix 


1  2 

2  2 

1    3 


We  reduce  A,  first  to  an  upper  Hessenberg  form  H,  and  then  apply  the  QR  method 
with  double  implicit  shift  with  tol  =  10-6.  After  8  iterations,  we  get  a  real  Schur 
decomposition  S7 AS  =  U.  The  eigenvalues  of  U  are  A  =  [  4.0394,  —1.0197  + 
0.4450s,  —1.0197  —  0.4450?  ].  Next,  we  use  the  technique  discussed  in  the  last  para- 
graph to  find  corresponding  eigenvectors.  Here  for  each  pair  of  complex  conjugate 
eigenvalues,  we  solve  the  equation  By  =  0  only  for  one,  and  then  take  the  real  and 
the  imaginary  parts  of  y  as  two  vectors  corresponding  to  the  complex  conjugate  pair 
of  eigenvalues.  The  corresponding  matrix  is: 


X  = 


0.5921  0.2979  0.7927 
0.7387  0.1265  -0.2726 
0.3222     -0.9462    -0.5453 


Next  consider  the  following  matrix  Ai,  which  we  get  by  perturbing  the  elements  of 
A: 

■4,= 


1.010    2.002       1.003 
2.004    2.005        1.006 
-1.007    3.009    -1.008 


If  we  use  Algorithm  2.1.1  to  A\  with  Xq  =  A.  and  tol  =  10-6,  then  after  2  iterations 
we  get  a  block  diagonalization  Y~1AlY'  =  Z)j,  and  if  we  apply  the  QR  method  with 
double  implicit  shift  to  STAxS,  then  after  2  iterations  we  obtain  a  real  Schur  form 
QT(STA1S)Q  =  R  as  well 

Example  2.1.2  Consider  the  matrix 


13  = 


0 

-9 

-3 

-6 

6 

5 

3 

3 

6 

6 

2 

3 

6 

6 

0 

5 

25 


If  we  apply  the  QR  method  with  double  implicit  shift  with  tol  =  10-6  to  B,  then 
after  1  iteration,  we  obtain  a  real  Schur  form  STBS  =  U,  and  the  eigenvalues  of  U 
are  A  =  [— 1,  —1,  2,  2].  Next,  we  use  the  technique  discussed  prior  to  Example  2.1.1 
to  find  corresponding  eigenvectors.  Here  is  the  matrix  of  eigenvectors: 


X 


0.7559  0  0.6547  0 

-0.3780  -0.5774  -0.4364  -0.4082 

-0.3780  0.5774  -0.4364  -0.4082 

-0.3780  0.5774  -0.4364  0.8165 


Next  consider  the  following  matrix  f?i,  which  we  get  by  perturbing  the  elements  of 
B: 


Bx  = 


-9.9931  -8.9947  -2.9930  -5.9995 

6.0059  5.0009  3.0091  3.0074 

6.0093  6.0065  2.0076  3.0033 

6.0085  6.0042  0.0026  5.0063 


If  we  apply  Algorithm  2.1.1  to  B1  with  X0  =  X,  and  tol  =  10-6,  then  after  2 
iterations  we  obtain  a  block  diagonalization  Y~l  B{Y  =  D\,  whereas  if  we  apply  the 
QR  method  with  double  implicit  shift  to  STB\S,  then  after  3  iterations  we  obtain  a 
real  Schur  form  QT(STB1S)Q  =  R. 


Example  2.1.3  Consider  the  matrix 
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C  = 


"  2 

-5 

7 

7 

9 

4 

5 

8 

-3  - 

-5 

5 

-9 

7 

4 

7 

9 

8 

3 

-5 

7 

—  7 

6 

9 

3 

4 

8 

5 

2 

3 

4 

5 

3 

-9 

9 

-7 

5 

4 

3 

9 

5 

6 

7 

7 

10 

9 

2 

3 

4 

5 

6 

6 

8 

-3 

5 

5 

10 

7 

6 

9 

6 

7 

9 

2 

2 

3 

9 

7 

2 

7 

4 

4 

1 

6 

3 

5 

5 

10 

1 

2 

3 

9 

4 

1 

9 

4 

1 

9 

-9 

2  - 

-7 

2 

7 

9 

3 

9 

5 

5 

6 

8 

8 

3 

5 

5 

1 

4 

6 

9 

4 

7 

9 

5 

7 

7 

2 

9 

9 

9 

5 

1 

5 

2 

10 

2 

6 

6 

0 

0 

8 

7 

3 

6 

2 

2 

-8 

4 

10 

9 

4 

3 

7 

8 

1 

4 

6 

2 

8 

2 

10 

3 

2 

6 

2 

3 

7 

1 

8 

5 

6 

6 

3 

5 

3 

6 

5 

4 

3 

4 

4 

4 

5 

9 

5 

9 

3 

5 

5 

7 

4 

6 

10 

9 

3 

10 

2 

8 

7 

5 

8 

_2 

2 

0 

5 

-8 

2 

9 

8 

7 

4 

-1 

6 

5   7 

-1 

4 

—  7 

4 

7 

6 

9 

8  " 

3   6 

8 

10 

4 

2 

10 

7 

8 

7 

2   5 

9 

9 

-6 

9 

5 

5 

3 

10 

5   9 

-5 

8 

8 

8 

1 

_2 

7 

9 

8   4 

8 

3 

4 

5 

5 

3 

2 

2 

8   2 

2 

7 

1 

-9 

3 

0 

4 

-3 

4   5 

1 

6 

8 

6 

10 

6 

1 

10 

6   1 

7 

6 

8 

2 

5 

1 

2 

-3 

1   3 

1 

8 

5 

3 

5 

8 

9 

7 

5   10 

6 

4 

8 

7 

7 

10 

10 

9 

7   8 

*7 

7 

_2 

9 

5 

5 

—  7 

i 

0   2 

4 

3 

5 

9 

4 

5 

8 

1 

1    7 

7 

2 

8 

7 

8 

-9 

-8 

8 

3   8 

8 

4 

3 

2 

-1 

2 

10 

2 

1   2 

6 

7 

8 

7 

8 

7 

6 

-6 

7   1 

2 

5 

9 

8 

6 

7 

3 

1 

1    3 

-8 

1 

3 

7 

7 

5 

7 

7 

2   7 

3 

5 

5 

8 

-1 

3 

6 

5 

7   7 

1 

6 

8 

10 

7 

3 

8 

2 

r)  -7 

5 

4 

2 

4 

C 

7 

7 

5 
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If  we  apply  the  QR  algorithm  with  double  implicit  shift  with  tol  —  10  6  to  C, 
then  after  27  iterations  we  get  a  real  Schur  form  STCS  =  U,  and  the  eigenvalues 
of  U  are  A  =  [92.2615,  -20.8808,  13.5193  +  10.7480?,  13.5193  -  10.7480?',  0.2459  + 
15.2499*,  0.2459  -  15.2499?',  6.3218  +  13.5313,  6.3218  -  13.5313,  -9.3504  +  7.8883?', 
-9.3504  -  7.8883?',  9.0619  +  8.7751?',  9.0619  -  8.7751i,  1.5718  +  10.4593?,  1.5718  - 
10.4593?,  8.0296,  -7.3844  +  1.4578»,  -7.3844  -  1.4578?',  0.6072,  -3.1993,  2.2102  ]. 
Next,  we  use  the  technique  discussed  prior  to  Example  2.1.1  to  find  correspond- 
ing eigenvectors.  Here  for  each  pair  of  complex  conjugate  eigenvalues,  we  solve  the 
equation  Bx  =  0  only  for  one,  and  then  take  the  real  and  the  imaginary  parts  of  x  as 
two  vectors  corresponding  to  the  complex  conjugate  pair  of  eigenvalues.  We  denote 
the  matrix  of  vectors  by  X.  Next  consider  the  matrix 


P  = 


7 

7 

9 

2 

7 

5 

9 

6 

9 

8 

3 

7 

5 

2 

1 

3 

9 

7 

5 

10 

10 

6 

7 

8 

6 

4 

2 

2 

9 

9 

6 

7 

1 

4 

3 

4 

9 

3 

5 

9 

4 

2 

9 

1 

10 

2 

9 

3 

7 

7 

8 

4 

2 

3 

10 

4 

8 

1 

7 

5 

1 

10 

5 

10 

4 

2 

9 

5 

8 

1 

5 

8 

2 

2 

9 

5 

1 

9 

2 

8 

3 

9 

7 

2 

9 

3 

4 

6 

8 

3 

0 

2 

6 

8 

6 

3 

4 

9 

3 

3 

5 

6 

9 

1 

3 

6 

5 

4 

3 

6 

6 

5 

6 

7 

0 

2 

5 

3 

2 

6 

2 

1 

2 

6 

3 

8 

2 

3 

4 

4 

7 

10 

3 

3 

0 

3 

1 

3 

7 

4 

2 

3 

3 

5 

1 

2 

5 

7 

1 

1 

8 

6 

1 

2 

8 

9 

3 

6 

5 

4 

4 

2 

5 

9 

7 

2 

9 
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6 
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5 

6 

4 

4 

6 

0 

4 

9 

8 

7 

4 

9 

6 

5 

10 

3 

7 

9 

10 

3 

4 

0 

6 

3 

7 

8 

5 

6 

4 

1 

2 

1 

1 
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3 

9 
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10 

10 

7 

4 

1 
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9 
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10 
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1 

6 

1 

5 

9 

1 

2 

5 

8 

7 

5 

7 

3 

8 

2 

9 

4 

3 

1 

9 

10 

6 

4 

3 

2 

5 

8 

6 

If  we  use  Algorithm  2.1.1  to  the  perturbed  matrix  C\  =  C+P/1000  with  X0  =  X,  and 
tol  =  10-6,  then  after  3  iterations  we  obtain  a  block  diagonalization  Y~lC\Y  =  D\, 
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whereas  if  we  apply  the  QR  method  with  double  implicit  shift  to  STCiS,  then  after 
27  iterations  we  obtain  a  real  Schur  form  QT(STCiS)Q  =  R. 

2.2      A  Differential  Equation  Approach  to  Eigencomputations 

In  the  previous  section,  we  found  that  the  main  problem  in  using  the  block  di- 
agonalization  method  is  how  to  choose  a  matrix  of  approximate  eigenvectors  for  the 
starting  guess.  Here  we  will  study,  whether  the  Euler  method  can  be  used  to  block 
diagonalize  a  nondefective  matrix  A  with  the  identity  matrix  as  the  matrix  of  approx- 
imate eigenvectors.  In  numerical  ordinary  differential  equations,  one  way  to  obtain 
the  Euler  method  is  to  drop  the  2nd  order  error  term  in  the  Taylor  Series  of  the 
given  function.  In  Theorem  2.1.5  dropping  the  2nd  order  error  terms,  we  get  the 
expressions  for  Aj(e),  and  Xj(e),  which  can  be  used  in  Euler 's  method.  Now  we  will 
discuss  how  to  implement  the  Euler  method  to  find  the  eigenvalues  of  a  nondefective 
matrix  A.  Euler's  method  for  the  differential  equation  z(t)  =  /(;(/))  has  the  form: 

2n+l   =  =n  +  At/(zn), 

where  zn  is  the  approximation  to  z(nAi),  and  A/  is  the  constant  time  step.  Here  c. 
and  /  are  vector  functions  of  /  in  general.  In  the  block  diagonalization  procedure, 
we  attempt  to  generate  a  block  diagonal  matrix  A,  and  an  associated  block  matrix 
A'  such  that  AX  =  A'A.  If  Aj  denotes  the  jth  diagonal  block  of  A.  then  we  have 
AXj  =  XjAj.  The  differential  equation  governing  the  block  matrices  has  the  form: 

\i/)  =  -A(0  +  Diag(A'(/)-1/l.Y(/)),       and      X(t)  =  X(t)F  (X(t),A(i))  , 

where  Diag(ilZ)  is  a  block  diagonal  matrix  formed  from  the  diagonal  blocks  of  the 
block  matrix  M,  Fjj(X(t),  A(i))  is  a  square  zero  matrix,  and  /•',_,(  A'(/),  A(/))  for  i  ^  j 
is  the  solution  B  to  the  matrix  equation  BAj(t)—Ai(t)B  =  )',(/)' .1A,(/).  Here! 
denotes  the  /th  block  row  of  X(t)~l.    For  convenience,  we  will  write  A  =  A(/).  and 
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X  =  X(t).  Typical  starting  guess  is  A(0)  =  Diag(yl),  and  X(0)  =  I.  Hence  the 
Euler  approximation  to  the  differential  equations  can  be  expressed  as: 

An+1  =  An  +  At  (Diag  (X-xAXn)  -  An)  ,    and    Xn+1  =  Xn  +  AtXnF  (Xn,  An) . 

(2.15) 
We  will  use  the  values  of  An+i,  and  Xn+\  from  (2.15)  to  create  an  iterative 
method  to  compute  a  block  diagonalization  of  an  n  x  n  real  matrix  A.  But  first, 
we  will  discuss  how  to  compute  a  block  diagonalization  of  an  upper  triangular  ma- 
trix. For  any  upper  triangular  matrix  T,  its  eigenvalues  are  the  diagonal  elements. 
We  will  show  how  to  compute  a  block  diagonalization  Y~lTY  —  D  of  T,  where 
D  =  diag(  Dn,  D22t  ■  ■  ,Du  )  with  the  property  that  each  diagonal  block  is  upper 
triangular,  and  the  eigenvalues  of  Z),-,  and  Dj  for  i  ^  j  are  distinct.  Let  Y  =  In,  and 
<*»j  =  \Ui~tjj\-  Given  a  tolerance  tol,  if  a^  >  tol,  then  define  z  = — .  Let  Wij 

*ii  ~  *jj 

be  equal  to  the  identity  matrix  except  for  the  (i,j)  element,  which  is  z.  Then  the 
product  W^  TWij  is  an  upper  triangular  matrix,  whose  (i,j)  element  is  zero.  Next 
update  Y  by  Ynew  =  YoldWij.  We  use  this  technique  to  zero  Uj,  whenever  a,^  >  tol, 
1  <  z  <  J  <  n.  An  efficient  way  to  zero  ttJ,  when  a,j  >  tol,  1  <  i  <  j  <  n,  is  to  start 
from  the  bottom  right  corner  of  the  matrix.  Zero  ttJ  in  the  decreasing  order  of  the 
row  index  i  until  i  =  1.  If  a^  <  tol  is  encountered,  then  momentarily  stop  zeroing 
on  the  jth  column  and  go  to  the  (j  —  l)th  column.  Continue  this  process  all  the 
way  to  the  second  column.  Once  the  second  column  is  done,  then  in  the  increasing 
order  of  the  column  index  j  go  to  those  columns,  where  some  of  the  ti3  are  not  zeroed 
in  the  first  round,  even  though  the  corresponding  a,-j  are  greater  than  the  tol.  This 
time  also  zero  <2J  in  the  decreasing  order  of  the  row  index  i  until  i  =  1,  whenever 
the  corresponding  a^-  >  tol.  Then  use  permutations  of  rows  and  columns  to  obtain 
D\\,  D22,  ■  •  • ,  Du.  Use  these  permutations  to  Y  to  switch  rows  and  columns  of  Y. 
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In  fact,  the  above  technique  can  be  extended  to  a  k  x  k  quasi-upper  triangular 
matrix  T  €  R"xn.  In  this  case  Wij  is  equal  to  the  k  x  k  block  identity  matrix 
except  for  the  (»,j)  block,  which  is  Z,  where  Z  is  the  solution  to  the  matrix  equation 
TaZ  —  ZTJ3  =  —Tij.  We  summarize  the  block  diagonalization  of  T  in  the  following 
algorithm. 

Algorithm  2.2.1  Given  a  k  x  k  quasi-upper  triangular  matrix  T  £  R"x",  and  a  coa- 
lescing tolerance  tol  greater  than  the  square  root  of  the  unit  roundoff,  the  following 
algorithm  computes  a  block  diagonalization  of  T. 

S  —  In    {  S  is  the  n  x  n  identity  matrix.  } 
row  =  [  ];  col  =  [ 
for  j  =  k  :  -1  :  2 

i  =  j  —  1;  minim  =  'Itol 
while  minim  >  tol  A  ?  >  1 

ctij  =  min{  |<7  —  u?\    :   a  6  A(7'll),  and  ix.'  6  \{TU)  }. 
if  Ojj  <  ^o/ 

row  =  [  i  row  ];  col  =  [  j  col  ] 

{  row,  and  col  are  two  arrays,  whose  elements  are  the  row  index  i. 
and  the  column  index  j  respectively,  of  an  entry  Tij  of  T  which  will 
nut  be  replaced  by  a  zero  matrix.  } 
minim  =  q,j 
else 

Solve  T,,Z  —  ZTjj  =  —Tij  for  Z,  and  then  form  11',,. 

t=\v-]t\\\j:    .s^.s'ir,, 

end 

i  =  i  —  1 
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end 
end 

m  =  length(row) 
for  /  =  1  :  m 

i  =  row(l)  —  1;  j  =  col(/) 
while  i  >  1 
if  atj  >  tol 

T  =  W^TWif,    S  =  SWl3 
end 

i  =  i  —  1 
end 
end 

Let  /  =  k,  and  suppose  AT  is  an  array  of  k  elements  whose  ith  element  is  the 
dimension  of  the  ith  block  of  T  along  the  diagonal.  For  i  <  j,  define  a,j  as  in  the 
above  code.  If  atJ  <  tol,  then  merge  blocks  Ttt  and  T3J  to  form  a  single  block  using 
permutations  of  columns  and  rows  of  T.  Use  those  column  permutations  to  S  to 
merge  S>  and  Sj.  Update  koU,  and  NTold  to  obtain  knew \  and  NTnew.  Try  the  above 
merging  technique  for  all  possible  combinations  of  1  <  i  <  j  <  I. 

Now  we  are  ready  to  compute  a  block  diagonalization  of  an  n  x  n  nondefective 
matrix  A  using  the  following  procedure: 

Algorithm  2.2.2  (Dynamical  Eiqencomputations)  Given  a  nondefective  matrix  A  € 
Rnxn,  a  tolerance  tol  greater  than  the  unit  roundoff,  a  coalescing  tolerance  toll 
greater  than  the  square  root  of  the  unit  roundoff,  a  constant  stepsize  At  smaller 
than  1,  and  a  starting  guess  diagonal  matrix  A0,  the  following  algorithm  uses  (2.15) 
to  compute  a  block  diagonalization  A  =  XA.X-1. 
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Define  tol2  =  lOOtoll .  Next  we  break  the  sketch  of  the  algorithm  into  several  steps. 

Step  0:  Take  X  =  I,  A  =  A0,  ,  Y  —  X~l ,  XB  =  n  (  the  number  of  blocks  ),  and 
NS  =  [  1.  1, . . . ,  1  ];  an  array  of  n  elements  whose  tth  element  is  the  dimension  of  the 
tth  diagonal  block  of  A. 

Step  1:  Let  m  =  XB.  For  i  <  j,  define  DIF  =  min{  |er  —  u>\  :  a  £  A(A7),  and 
Ui  €  A  (A,)  }.  If  DIF  <  toll ,  then  merge  blocks  A,  and  \}  to  form  a  single  block 
using  permutations  of  columns  and  rows  of  A.  Use  those  column  permutations  to  A' 
to  merge  A",  and  Xj,  and  those  row  permutations  to  Y  to  merge  Y?  and  Yj .  Update 
NBold,  and  XSold  to  get  XBn€W,  and  NSnew.  Try  the  above  merging  technique  for 
all  possible  combinations  of  1  <  i  <  j  <  m.  Using  column  and  row  permutations, 
arrange  blocks  of  A  in  the  decreasing  order  of  sizes  along  the  diagonal.  Use  those 
column  permutations  to  arrange  block  columns  of  X  in  the  decreasing  order  of  sizes, 
and  those  row  permutations  to  arrange  block  rows  of  Y  in  the  decreasing  order  of 
sizes. 

Step  2:  Construct  F{X,  A)  as  follows: 

F    (Y  W-f  °        for  *  =  h 

where  P  =  PtJ  is  the  solution  to  the  matrix  equation  P\:  —  A,\P  =  Y? AXj. 

St(>p  3:  Suppose  \i  =  [Am+i,  Am+2, . . . ,  A,ve]  ,  where  0  <  rn  <  NB  with  dim(A,)  > 
1  for  i  =  1,2,...,  7?i,  and  dim(A,)  =  1  for  i  =  m  +  1,  m  +  2, . . . ,  XB.    Define  d  = 

m 

diag(  YAX )  (/  :  77)  — //.  where  t  =  1  +  y]  NS(i),  and  diag(M)(<  :  n)  is  a  vector  formed 

1=1 
from  t  through  77  elements  of  the  vector  diag(A/).    Our  aim  is  to  find  the  smallest 

s  €  (0, 1)  with  the  property  //,  +  sdi  =  fij  +  sd:.  To  obtain  this  5  we  do  the  following. 

Let  /■  lie  ;ni  array  with  the  property  that  //;.(,)  <  //;.(1+i).  Next  form  the  ratio: 
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and  define  s  =  min{  r,-  :  i  =  1, 2, . . . ,  n  —  t,  0  <  r, <  1  }.  If  s  <  At,  then  take  a  time 
step  of  size  s  instead  of  the  normal  time  step  At.  Otherwise  take  the  normal  time 
step  At. 

Step  4:  Update  A,  A',  and  Y  as  follows: 

Aneu,    =    Xoid  +  Atf  (Y?td)  T  AXf  -  Af )  ,      j  =  1, 2, . . . ,  NB, 

Xnew      =      Xold  ^j  +  AfF  ^Xold^  Aoldj  j  ^ 
■ynew       /  yneto\- 1 

Step  5:  Consider  the  block  Aj  with  1  <  j  <  m,  where  m  is  defined  as  in  Step 
3.  Let  Aj  =  QUQT  be  a  Schur  decomposition  of  Ar  Then  U  =  D  +  N,  where 
D  =  diag(  Uix,  U22,  ■  •  ■ ,  Uu  ),  and  Ar  is  the  strictly  upper  triangular  part  of  U.  Uu  is 
either  a  1  x  1  matrix,  or  a  2  x  2  matrix.  Let  M/  be  an  array  whose  ith.  element  is  the 
dimension  of  the  block  Uu.  Use  Algorithm  2.2.1  with  tol2  as  the  coalescing  tolerance 
to  reduce  U  to  a  block  diagonal  matrix,  to  update  the  array  NU  and  the  number  of 
blocks  /,  and  to  get  the  invertible  matrix  S.  Replace  Aj  by  U,  X3  by  X3QS,  V)  by 
S~lQTYj,  NB  by  NB  +  I  -  1,  and  the  size  of  the  block  Aj  in  NS  by  NU.  Try  the 
above  decoupling  procedure  for  all  j  such  that  1  <  j  <  m. 

Step  6:  Compute  /  =  ||A-  XAY\\F.  Goto  Step  1  until  /  <  tol. 

Example  2.2.1  If  Algorithm  2.2.2  is  applied  to 


A  = 


3-3    4 

-1     -2    3 

3       4    2 


with      Aq 


3 

0 

0  " 

0 

—2 

0 

0 

0 

2 

tol  =  0.01,  toll  =  10"4,  and  At  =  0.05,  then  after  130  iterations  the  diagonal  matrix 
Aq  converges  to  A  =  diag(  6.11220,  -5.34158,  2.22938  ). 


Example  2.2.2  Similarly,  if  we  apply  Algorithm  2.2.2  to  the  matrix 
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A  = 


-2     -4     1 

2  4     1 

3  -1    2 


with      A0 


-2    0    0 

0    4    0 
0    0    2 


tol  =  0.01,  toll  =  10-4.  and  At  =  0.05,  then  it  takes  131  iterations  to  the  diagonal 
matrix  A0  to  converge  to 


D  =  diag 


-37.00078    27.61720 
-58.11556    42.99974 


•1.99S96 


The  eigenvalues  of  D  are  A  =  [  2.99948  +  2.22976*,  2.99948  -  2.22976*,  -1.99896  ]. 


Example  2.2.3  If  we  apply  Algorithm  2.2.2  to  the  matrix 


1  2     -1  1        3  -1  -2  -4 

4  2        1  -1     -3  1  -2  -1 

-1  -13  11  -2  -2  2 

3  -3        1  4-2-1  1-1 

4  3  -3  -1  -1  2  2  1 
-3  -1  -2  1  -1  -2  4  3 
-1  -2  -2  2  2  3-4  1 
-2  2-1  1     -3  3  4  -3 


with  A0  =  diag(  1,  2,  3,  4.  -1,  -2.  -4,  -3),  tol  =  0.01,  toll  =  10"',  and  A/  =  0.01, 
then  it  takes  802  iterations  to  the  diagonal  matrix  Aq  to  converge  to 


I)     —     diag 


-109.0896      62.4219 
-193.3755     110.5633 

-1.2449    -196.8223 

0.1837    10.7697 


15.3313  -35.8551 
13.3030  -28.3276 


.  0.X312,  —1.8365 


I  he  eigenvalues  of  I)  arc  A  =  [0.7369  +  3.0020/,  0.7369-3.0020/,  -6.4982 +  0.67 11)/. 
-6.4982  -  0.6740/,  4.7624  +  0.2597/,  1.7621  -  0.2597/.  6.8342,  -4.8365  ]. 

This  method  lias  the  following  disadvantage.  As  the  size  of  the  given  matrix  A 
grows,  we  need  to  take  a  stepsize  smaller  than  0.1.  However  with  a  stepsize  smaller 
t  han  0. 1 .  too  many  iterat  ions  are  required  to  obtain  a  few  decimal  places  of  accuracy. 
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2.3     A  Differential  Equation  Approach  to  Eigencomputations  with  Armijo's  Stepsize 

In  the  Differential  Equation  Approach  to  Eigencomputations,  we  found  that  with 
a  constant  time  step  At,  which  is  usually  a  small  number,  too  many  iterations  are 
required  to  achieve  a  few  digits  accuracy  in  the  results.  So  here  we  will  modify  the 
Differential  Equation  Approach  to  Eigencomputations  by  varying  the  time  step  At 
in  each  iteration.  We  plan  to  achieve  this  by  using  Armijo's  rule  from  optimization 
theory. 

Let  A  be  an  n  x  n  real  nondefective  matrix.  We  will  find  an  iterative  method  to 
compute  a  block  diagonalization  of  the  form  A  =  XAX'1,  where  A  =  diag(  Ai,  A2, 
. . . ,  At  )  is  a  block  diagonal  matrix,  and  X  =  [  Xi ,  X2, . . . ,  Xt  ]  is  a  compatible  block 
column  matrix  such  that  AXj  =  XjAj.  As  in  Section  2.2,  for  the  block  eigenvalue 
problem,  the  differential  equations  that  we  solve  are: 

A(t)  =  -A(t)  +  Diag  (Xity'AXit))  ,       and      X(t)  =  X(t)F(X(t),  A(i)) , 

where  Diag(M)  is  a  block  diagonal  matrix  formed  from  the  diagonal  blocks  of  a 
block  matrix  M,  Fjj(X(t),  A(t))  is  a  square  zero  matrix,  and  FlJ{X{t),  A(t))  for  i  ^  j 
is  the  solution  B  to  the  matrix  equation  BAj(t)  -  Az{t)B  =  Yx{t)T AX3(t).  Here 
Yt{t)T  denotes  the  ith  block  row  of  X(i)-1.  Hence  the  Euler  approximation  to  the 
differential  equations  can  be  expressed  as: 

An+1  =  An  +  At  (Diag  (X-'AXn)  -  An)  ,     and     Xn+1  =  Xn  +  AtXnF  (Xn,  An) . 

(2.16) 
In  Euler's  method  the  normal  stepsize  At  is  constant.    Here  we  will  vary  the 
stepsize  At  in  each  iteration.  Let  s  be  a  positive  parameter,  and  let  Cl(s),  and  Z(s) 
be  defined  by 

tt(s)  =  An  +  s  (Diag  (A;MA'„)  -  An)  ,     and     Z(s)  =  Xn  +  sXnF  (Xn,  An) .  (2.17) 
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So  H(0)  =  An,  Z{0)  —  Xn,  and  fi(A<)  =  An+i,  Z(At)  =  A'n+J,  the  matrices  gener- 
ated by  a  Euler  step.  Define  f(s)  =  ||G(s)||F,  where  G(s)  =  A  -  Z(s)Q(s)Z(s)-1 . 
Hence  /(0)  =  \\A  -  XnAnX^  \\F,  and  /(A/)  =  \\A  -  Xn+1  An+1  X£,  ||F. 

If  the  starting  guess  block  diagonal  matrix  Ao,  and  the  starting  guess  invertible 
matrix  A'o  are  good  approximations  of  A,  and  X  in  the  factorization  A  =  AAA "_1. 
then  we  must  have  f(At)  <  /(0).  Here  our  goal  is  to  find  an  a,  0  <  .s  <  1,  for  which 
/(■s)  5-  /(0)  holds.  To  this  end,  we  use  Armijo's  rule  from  optimization  theory.  In 
Armijo's  rule,  we  determine  s  in  the  following  way.  Evaluate  /(.s)  at  3  =  1,=,  4,. . ., 
stopping  when 

f(s)  <  (l  - 1)  /(0). 
Simplifying  the  above  inequality  we  get 

/M-/(Q)<_l/(0), 

s  2 

It  turns  out  ([HagSS,  page  178  —  80])  that  to  use  the  above  rule,  we  must  have 
/'(0)  =  —  /(0).  So  our  next  aim  is  to  determine  /'(0),  when  f(s)  =  \\G(s)\\f,  and 
G(s)  =  A  —  Z(s)Q($)Z(s)~1 .  Suppressing  the  subscripts  of  A„,  and  A'n  in  (2.17),  we 
get 

ft(.s)  =  A  +  s  (Diag  (X^AX)  -  A)  ,    and  Z(s)  =  X  +  sXF  (X,  A) . 

Before  we  find  /'(0),  we  want  to  prove  the  following  fundamental  result. 

Lemma  2.3.1  Let  G(s)  be  an  n  xn  complex  matrix,  ichose  eh  mints  art  difft  n  ntiablt 
functions  of  s.   If  in  define  f(s)  =  \\G(s)\\f,  then 

/(0)/'(0)  =  tract  (G(0)H^G(s)\tJ\  .  (2.18) 

Proof  of  Lemma  2.3. 1  Clearly 

f(s)2  =  \\G(s)\\2F  =  trace  (G(s)HG(s))  .  (2.19) 
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Differentiating  (2.19)  with  respect  to  s,  we  have 

2f(s)f'(s)     =    trace  (±  (G(sf)  G(s)  +  G(s)H  ±G(sfj 

=    trace  ({4;G(s)\     G(s)  +  G(sf  ^G(s)) 

=    2tr<ice  (g(s)h-^-G(s)Y 

After  simplification  setting  s  =  0,  we  get 

/(0)/'(0)  =  trace  (G(0)H  ^G(s)\s=o)  •  ° 

Now  with  Sl(s)  =  A  +  s{Dia,g{X-1AX)  -  A),  Z(s)  =  X  +  sXF(X,A) ,  G(s)  = 
A-  Z(s)ft{s)Z{s)-\  and  f{s)  =  \\G{s)\\F  and  the  result  of  the  above  lemma,  we  are 
in  a  position  to  show  that  /'(0)  =  —  /(0). 

Theorem  2.3.1  Let  A  e  RnXn  be  a  nondefective  matrix.  Suppose  A  =  XAX'1  is  a 
block  diagonalization  of  A,  where  A  =  diag(  Au  A2, . . . ,  At  )  is  a  block  diagonal  matrix 
such  that  A,,  and  A3  for  i  ^  j  have  distinct  eigenvalues,  and  X  =  [  X\,X2,  ...,Xt] 
is  a  compatible  block  column  matrix.  If  we  define  £l(s)  =  A  +  s  (Diag(X~1AX)  -  A), 
Z{s)  =  X  +  sXF{X,A),  G(s)  =  A-Z{s)n{s)Z(s)-1,  and  f(s)  =  \\G{s)\\F,  where 
Diag(M)  is  a  block  diagonal  matrix  formed  from  the  diagonal  blocks  of  the  block 
matrix  M,  FJ3  {X,  A)  is  a  square  zero  matrix,  and  Ftj  (X,  A)  for  i  /  j  is  the  solution 
P  to  the  matrix  equation  PA3  -  AtP  =  Y?AXj,  where  Y?  denotes  the  ith  block  row 
ofX-\  thenf'(0)  =  -f(0). 

Proof  of  Theorem  2.3.1  The  result  (2.18)  of  Lemma  2.3.1  gives 

/(0)/'(0)  =  trace  (g(0)t^G(s)\s=o)  .  (2.20) 
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Differentiating  the  expression  for  G(s)  with  respect  to  s,  we  have 

j-G(«)  =  -^(Z(s))n(s)Z(s)-1-z(s)^(n(s))Z(sr1-z(s)n(s)^-(z(s)-') 

as  as  as  as  v  ' 

=  -±(Z(s))n(s)Z(s)-1  -  z(s)-^(n(s))Z(s)-> 

as  as 

+Z(s)Q(s)Z(srij-(Z(s))Z(s)-1.  (2.21) 

Evaluating  —  G(s)  at  s  =  0,  we  obtain 
as 

^G(s)\s=0  =  -^z(^)|s=on(0)Z(0)-1-z(0)^n(6)|s=oz(0)-1 

+  Z(0)n(0)Z(0)-1^Z(s)|,=oZ(0)-1.  (2.22) 

Differentiating  f2(s),  and  Z(s)  with  respect  to  5  and  then  evaluating  the  derivatives 
at  s  —  0,  we  get 

^(5)|s=0  =  -A  +  Diag(X-M.V),       and      —  Z(s)\a=0  =  XF{X,A). 

Using  these  values  and  values  of  f)(0),  and  Z(0)  in  (2.22).  wo  have 

fsG(s)U=o    =    -XF(X)A)AX~i  -  X  (Diag  [X'Wx)  -  \)  A"1 
+XAX-1(XF(X,A))X~i 

=    -X  (F  {X,  A)  A  -  \F  (X,  A))  A'"1 

-A'Diag  (.VMa)  A'-1  +  A' A  A'-1 .  (2.2:?) 

Next,  let  D  =  F(X,A)A  -  \F(X.\).  Then 

Dii  =  {  FIJAJ-A1F1J     for    i  ?  j\  ('2'2U 

where  Fi}  =  FtJ(X,  A).  According  to  our  assumption  for  t  /  j,  FtJ  is  t ho  solution  P 
to  the  matrix  equation  PA,  -  A,P  =  i/.l.Vj.  So  FijAj  —  A,F,j  =  Y?AXj.  Hence 
after  simplification  (2.21)  gives 

F(A',  A)A  -  AF(A\  A)  =  A'-'.l.V  -  Diag  (x~lAX)  .  (2.25) 
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Using  (2.25)  into  (2.23),  we  have 

^-G(s)\a=0    =    -X(X-1AX-Bmg(x-1AX))X-1 
-XDiag  (A'-MA')  X-1  +  XAX'1 
=    -  [A  -  XAX'1)  .  (2.26) 

Since  G(0)   =  A  -  Z(0)fi(0)Z(0)-1   =  A  -  XAX~\  so  (2.26)  gives  —  G(s)\s=0  = 

as 

—(7(0),  and  with  this  value  (2.20)  reduces  to 

/(0)/'(0)     =    trace  (G'(0)T(-G(0))) 

=   -/(O)2. 

Hence  /'(0)  =  -/(0),  provided  /(0)  ^  0.    □. 

The  above  theorem  implies  that  Armijo's  rule  can  be  applied  to  Algorithm  2.2.2 
mentioned  in  the  Differential  Equation  Approach  to  Eigencomputations. 

A  modified  algorithm  to  find  the  eigenvalues  and  corresponding  eigenvectors  of  a 
nondefective  matrix  A  can  be  obtained  as  follows: 

Algorithm  2.3.1  Given  a  nondefective  matrix  A  G  Rnxn,  a  tolerance  tol  greater  than 
the  unit  roundoff,  a  coalescing  tolerance  toll  greater  than  the  square  root  of  the 
unit  roundoff,  an  invertible  matrix  Xo,  and  a  block  diagonal  matrix  Ao,  the  following 
algorithm  uses  (2.15)  to  compute  a  block  diagonalization  A  =  XAX-1. 

Define  tol2  =  lOOtoll .  Next  we  break  the  sketch  of  the  algorithm  into  several  steps. 

Step  0:  Take  X  =  X0,  Y  =  X-1,  A  =  A0.  Let  NB  denote  the  number  of 
diagonal  blocks  of  A,  and  let  NS  denote  an  array  of  NB  elements  whose  ith  element 
is  the  dimension  of  the  ith  diagonal  block  of  A. 

Step  1:  Let  m  =  NB.  For  i  <  j,  define  DIF  =  min{  \a  —  u>\  :  a  €  A(Aj),  and 
ui  £  A(A;)  }.  If  DIF  <  toll ,  then  merge  blocks  A;  and  Aj  to  form  a  single  block  using 
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permutations  of  columns  and  rows  of  A.  Use  those  column  permutations  to  X  to 
merge  X;  and  Xj,  and  those  row  permutations  to  Y  to  merge  YtT  and  Vj7.  Update 
NBold,  and  NSold  to  get  NBnew,  and  NSnew.  Try  the  above  merging  procedure  for 
all  possible  combinations  of  1  <  t  <  j  <  m.  Using  column  and  row  permutations, 
arrange  blocks  of  A  in  the  decreasing  order  of  sizes  along  the  diagonal.  Use  those 
column  permutations  to  arrange  block  columns  of  X  in  the  decreasing  order  of  sizes, 
and  those  row  permutations  to  arrange  block  rows  of  Y  in  the  decreasing  order  of 
sizes. 

Step  2:  Construct  F(X,  A)  in  the  following  way. 
Take  F:j  [X,  A)  =  0,  which  is  an  rrij  x  m:  zero  matrix,  and  for  i  ^  j,  Fl3  (A',  A)  can 
be  determined  by  using  the  following  loop: 

for  j  =  1  :  NB 
for  i  =  1  :  NB 

if  i  ^  J 

Solve  PAj  -  \,P  =  Y?AXj  for  P. 
Fij(X,A)  =  P. 
end 
end 
end 

Step  3:  Let  f(s)  =  \\G(s)\\F,  where  G{s)  =  A  -  X(s)A(s)X(s)-\  X(s)  = 
X(I  +  sF),  and  A(s)  =  A  +  s  (Diag  (X'-'AX)  -  A).  Evaluate  f(s)  at  5  =  1.  ~r  \.  •  •  • , 
stopping  when 

/(«)<  (l  - 1)  /(0).  (2.27) 

Let  z  be  the  first  value  of  s  for  which  (2.27)  is  true. 
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Step  4:  Suppose  fi  =  [Am+1,  Am+2, . . . ,  Awe]    ,  where  0  <  m  <  A^5  with  dim(A,-)  > 
1  for  i  =  1,2, . . . ,  m,  and  dim(A,-)  =  1  for  i  =  m  +  1,  m  +  2, . . . ,  NB.    Define  d  = 

m 

diag(yAY)  (t  :  n)—fi,  where  <  =  1  +  22  NS(i),  and  diag(A/)(£  :  n)  is  a  vector  formed 

i=l 

from  t  through  n  elements  of  the  vector  diag(Af ).  Our  aim  is  to  find  two  indices  il, 

and  iu  as  follows.    Let  k  be  an  array  with  the  property  that  fiui)  <  fikli+i)'    Next 

form  the  ratio: 

_  fik(i+i)  ~  Vk(i) 
dk(i)  —  djt(t+i) 

and  let  r,0  =  min{  r,-  :  i  =  1,2, . . . ,  n  —  t,  0  <  r,-  <  1  }.   Let  il  =  t  —  1  +    min{&(i0), 
fc(?'o  +  1)  },  and  n*  =  i  —  1    +  max{  k(i0),  k(i0  +  1)  }. 
Step  5:  Update  A,  X,  and  F  as  follows: 

old     I     _  |   (\'old\         a  \rold  \old 


Af"    =    A""1  +  ^  I  (F/'"]    /LY?"  -  A°/d)  ,      j  =  1,2,..., NB, 

Xnew      =      Xold  ^  +  zp  ^yoM    Aold^  ^ 
ynew      /  -ynew  \~^ 

Step  6:  Consider  the  block  Aj  with  1  <  j '  <  m,  where  m  is  defined  as  in  Step 
4.  Let  Aj  =  QUQT  be  a  Schur  decomposition  of  Aj.  Then  U  =  D  +  N,  where 
D  =  diag(  U\\,  U22,  •  •  •  1  Uu  ),  and  N  is  the  strictly  upper  triangular  part  of  U.  Uu  is 
either  a  1  x  1  matrix,  or  a  2  x  2  matrix.  Let  M7  be  an  array  whose  z'th  element  is  the 
dimension  of  the  block  Uu.  Use  Algorithm  2.2.1  with  tol2  as  the  coalescing  tolerance 
to  reduce  U  to  a  block  diagonal  matrix,  to  update  the  array  NU  and  the  number  of 
blocks  /,  and  to  get  the  invertible  matrix  S.  Replace  Aj  by  U,  X3  by  X3QS,  Yj  by 
S-xQTYj,  NB  by  NB  +  /  -  1,  and  the  size  of  the  block  Aj  in  NS  by  NU.  Try  the 
above  decoupling  procedure  for  all  j  with  1  <  j  <  m. 

Step  7:  Using  row  and  column  permutations  merge  A,;)t7  and  A!Ujtu  to  form  a  2  x  2 
diagonal  block.  Use  those  column  permutations  to  X,  and  those  row  permutations 
toF. 


42 


Step  8:  Compute  /(0)  =  ||A  -  X\Y\\F.  Goto  Step  1  until  /(0)  <  tol. 


Example  2.3.1  If  we  apply  Algorithm  2.3.1  to 

with      A0  = 


A  = 


0  1  1 
-2  -2  -1 
-2    -1        0 


0  0  0 
0  -2  0 
0        0    0 


A'o  =  /.  to!  =  10  6,  and  toll  =  10  4,  then  after  42  iterations  the  diagonal  matrix  Ay 
converges  to 


D  =  diag 


-6.01310    4.53255 
-5.98585    4.01310 


,  0 


The  eigenvalues  of  D  are  A  =  [-1  +  1.41421*,  -1  -  1.41421/,  0]. 


Example  2.3.2  If  Algorithm  2.3.1  is  applied  to 

with      A0  = 


.4  = 


-2-1        0 
0        1        1 
-2    -2    -1 


_2 

0 

0  " 

0 

1 

0 

0 

0 

-1  _ 

A'o  =  /,  the  tolerance,  and  the  coalescing  tolerance  are  as  in  Example    2.3.1.  then 
after  196  iterations  the  diagonal  matrix  Ao  becomes 

D196  =  diag(  -1.04694840,  0.00224879,  -0.95530039  ). 

In  the  following  4  successive  iterations  D196  changes  to 

D197  =  diag  (-1.04676152,  0.00223123, -0.9554697]  ) 

Digs  =  diag  (-1.04657539,  0.00221380,  -0.95563841) 

Dl99  =  diag  (-1.04639002,  0.00219651,  -0.95580649) 

Ihoo  =  diag  (-1.04620540,  0.00217935,  -0.95597395). 

In  the  above  iterations  a  =  —  is  the  Armijo's  stepsize.  From  the  diagonal  matrices 

Digs, D20o,  it  is  clear  thai  the  first,  and  the  third  entries  along  the  diagonal  are 

slowly  approaching  each  other. 
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Example  2.3.3  Consider  the  matrix 


5 

4 

9 

3 

3 

7 

1 

2 

2 

6 

7 

3 

-2 

_2 

-5 

6 

8 

1 

-1 

4 

3 

-6 

-9 

6 

3 

7 

7 

-1 

-4 

-3 

4 

4 

2 

1 

6 

-2 

_2 

-2 

6 

8 

9 

2 

7 

4 

3 

5 

-3 

-5 

3 

If  we  apply  the  QR  method  with  double  implicit  shift  with  tol  =  10  6  to  A,  then 
after  11  iterations,  we  obtain  a  real  Schur  form  ST AS  =  U,  and  the  eigenvalues 
of  U  are  A  =  [  10.8425  +  7.9032/,  10.8425  -  7.9032?:,  -2.6168  +  8.7556?',  -2.6168  - 
8.7556?,  3.5414  +  1.2899?,  3.5414  -  1.2899?:,  8.4658  ].  Next,  we  use  the  technique 
discussed  prior  to  Example  2.1.1  to  find  corresponding  eigenvectors.  Here  for  each 
pair  of  complex  conjugate  eigenvalues,  we  solve  the  equation  Bx  —  0  only  for  one, 
and  then  take  the  real  and  the  imaginary  parts  of  x  as  two  vectors  corresponding 
to  the  complex  conjugate  pair  of  eigenvalues.  We  denote  the  matrix  of  vectors  by 
X.  Consider  the  following  perturbed  matrix  A\,  which  we  obtain  by  perturbing  the 
elements  of  A: 


Ai  = 


5.09 

4.07 

9.06 

3.06 

3.03 

2.02 

2.04 

6.10 

7.03 

3.06 

-4.98 

6.04 

8.06 

1.05 

-0.99 

-5.97 

-8.95 

6.01 

3.05 

7.07 

-3.91 

-2.99 

4.10 

4.10 

2.06 

7.05 

1.09  " 

1.96 

-1.96 

4.02 

3.01 

7.00 

-0.91 

1.09 

6.01 

9.04 

2.02 

4.99 

3.01 

-1.93    -1.94    -1.96    6.01       8.08 
7.02       4.08       3.01    5.02    -2.98 


If  we  use  Algorithm  2.3.1  to  Ax  with  X0  =  X,  A0  =  X~1AX1  tol  =  10"6,  and 
toll  =  10-4,  then  after  3  iterations  we  get  a  block  diagonalization  Y  A\Y  =  D\, 
whereas  if  we  apply  the  QR  method  with  double  implicit  shift  to  STAiS,  then  after 
9  iterations  we  obtain  a  real  Schur  form  QT(STA\S)Q  =  R. 


Example  2.3.4  Consider  the  matrix 


11 


.4  =  V'diag(  1,  2,  1,  3,  2,  1,  -1  )Y 


-l 


where  Y  —  Qd\a.g(  1,  2,  3,  4,  —1,  —2,  —3  )QT,  and  Q  is  an  orthogonal  matrix.  If  we 
apply  the  QR  method  with  double  implicit  shift  with  tol  =  10-6  to  .4.  then  after 
4  iterations,  we  obtain  a  real  Schur  form  S  AS  =  U.  Next,  we  use  the  technique 
discussed  prior  to  Example  2.1.1  to  find  corresponding  eigenvectors.  We  denote  the 
matrix  of  eigenvectors  by  X.  Consider  the  matrix 

4  5  9  5  0  9  4 
2  5  19    7  2  8 

5  3  5  19  3  3 
9  10  4  S  6  4  4 
9  5  3  8  7  5  5 
1  3  9  S  7  6  5 
9  15  10  8  3 

If  we  use  Algorithm  2.3.1  to  the  perturbed  matrix  A\  =  A+  P/1000  with  A'0  =  X, 
A0  =  A'-M.Y,  tol  =  10"6,  and  toll  =  10~\  then  after  3  iterations  we  obtain 
a  block  diagonalization  Z~l A\Z  =  D\,  whereas  if  we  apply  the  QR  method  with 
double  implicit  shift  to  STA\S,  then  after  6  iterations  we  obtain  a  real  Schur  form 
Q*(STAlS)Ql  =  Ri. 

Although  this  method  can  be  applied  to  find  eigenvalues  of  some  matrices  (  Exam- 
ple 2.3.1,  )  in  general  it  does  not  work  for  all  matrices.  For  example,  if  a  matrix  has 
equal  eigenvalues,  or  the  size  of  the  matrix  is  large,  then  the  method  does  not  work  ( 
Example  2.3.2.  )  In  all  of  these  cases,  two  1  x  1  diagonal  blocks  approach  each  other, 
but  as  the  iterations  continue,  the  rate  at  which  they  approach  each  other  gets  slower 
and  slower.  However  if  by  another  method,  a  block  diagonalization  A  —  AAA-1  can 
be  computed,  then  we  can  use  A  as  the  starting  guess  block  diagonal  matrix,  and 
X  as  the  starting  guess  invert ible  matrix  to  compute  a  block  diagonalization  of  the 
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perturbed  matrix  A  +  eE,  where  E  is  an  arbitrary  matrix,  and  e  is  a  small  scalar  ( 
Example  2.3.3  and  2.3.4.  ) 

2.4     Convergence  of  Block  Diagonalization  of  a  Matrix 

In  Section  2.1,  we  developed  Algorithm  2.14  to  block  diagonalize  a  nondefective 
matrix.  Here  we  will  examine  the  speed  of  convergence  of  Algorithm  2.14. 

Let  A  be  an  n  x  n  nondefective  matrix,  and  let  A  =  YAY'1  be  a  block  diag- 
onalization of  A,  where  A  =  diag(  Ai,  A2, . . . ,  A*  )  is  a  block  diagonal  matrix,  and 
Y  =  [  l'i,  Y2, . . .  ,Yk]  is  an  associated  block  column  matrix  such  that  AYj  =  YjA3. 

Define  X  =  Y  (I  +  E) ,  Xnew  =  X  {I  +  F),  where  E  is  an  n  x  n  perturbation 
matrix  such  that  Diag  (X~1AX)  =  D  =  diag(  Di:  D2,  . . . ,  Dk  )  is  an  approximate 
block  diagonalization  of  A,  Di,  and  Dj  for  i  ^  j  have  distinct  eigenvalues,  and 
F  =  [Fij)  is  a  k  x  k  block  matrix  such  that  Fjj  is  a  square  zero  matrix,  and  F{j  for 
i  ^  j  is  the  solution  P  to  the  matrix  equation  PDj  —  D{P  =  [X  AX)-.  Clearly 
E  =  Y~lX-I.  Assume  X  is  close  to  Y.  We  will  show  that  ||Aneu,-V'||  =  <9(||£||2)  = 
0(||A-F||2),  and  ||Diag(A-MA)  -  A||  <  0(\\X-Y\\2).  That  means  the  block 
diagonalization  method  converges  locally  quadratically. 

Theorem  2.4-1  Let  A  €  Rnxn  be  a  nondefective  matrix  and  suppose  A  =  YAY  is 
a  block  diagonalization  of  A,  where  A  =  diag{  Al5  A2,  . . . ,  A&  )  is  a  block  diagonal 
matrix,  and  Y  =  [Y\,  Y2,  . . . ,  Yk  }  is  an  associated  block  column  matrix  such  that 
AY)  =  YjAj,  and  A,-,  and  Aj  for  i  ^  j  have  distinct  eigenvalues.  Suppose  the  pertur- 
bation matrix  E  E  Rnxn  is  sufficiently  small  to  ensure  that  X  is  close  to  Y  in  the 
equation  X  =  Y  (I  +  E),  and  Diag{X~l  AX)  =  D  is  an  approximate  block  diagonal- 
ization of  A.  Define  Xnew  —  X  (I  +  F),  where  Fij  for  i  =  j  is  a  square  zero  matrix, 
and  for  i  ^  j  is  the  solution  P  to  the  matrix  equation  PD3  —  D,P  =  (X~1AX)i-. 


46 

Then 

\\Diag  (-V-M.Y)  -  A||  <  0  (\\E\\2)  ,       and      \\Xnew  -  Y\\  =  O  (\\E\\2)  . 
Proof  of  Theorem  2-4-1  Assume  that  if  |pu|  <  |g,j|  for  i,j  =  1,2 ,  n,  then  ||P||  < 

||g||.  Now 

X~lAX    =    {I  +  E)~1Y-1AY(I  +  E) 

=    (I  +  E)-1  A  (I  +  E). 

Since  (/  +  Ef1  =  I  -  E  +  E2{I  +  E)~\  so 

X~XAX    =     [/-  E  +  E2(I  +  Eyl]\(I  +  E) 

=    A  +  AE-EA-  EAE  +  E2  {I  +  E)~l  A  (I  +  E) . 

Let  L  =  -EAE  +  E2  {I  +  E)~l  A  (/  +  E).  Then 

||I||    <    \\-EAE\\  +  \\E2(I  +  ETx  A{I  +  E)\\ 
<    IIA||||E||2  + 


iiA||  PIP  (1  +  \\E\ 


I -PI 

2||A||||£||2 


i  -  Pll 

Since  A'-M.Y  =  A  +  AE  -  EA  +  L,  so 

Diag  (.VM.V)  =  A  +  Diag(A£  -  EA)  +  Diag(£).  (2.2S) 

Next  normalize  Y  such  that  the  diagonal  blocks  of  Y~lX  are  identity  matrices. 
That  is  DiagtV'-'A)  -  /  =  0.  Let  Z  =  V_I.  then  from  the  relation  E  =  Y~XX  -  I. 
we  have 


0  for     /'  =  j 

and  so,  if  C  =  AE  -  EA,  then 


7--\   ZjX}     for     i*j, 


(   (J  for     i  =  j, 

Ut}-  \  AiEij-EijAj    for    i^j. 
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Thus  Diag(A£  -  E\)  =  Diag(C)  =  0,  and  from  (2.28)  we  have  Diag(A'-MA')  -  A  = 
Diag(L).  Hence 

||Diag(X-Mx)-A||     =     ||Diag(I)|| 

<     ||JL||<0(||E||2),  (2.29) 

which  proves  the  first  part  of  Theorem  2.4.1. 

Since  Diag(A'-Mx)  =D  =  diag(  Du  D2, . . . ,  Dk  ),  and  F  =  (FtJ)  is  a  k  x  fc  block 
matrix,  where  FJ3  =  0,  and  for  i  ^  j,  FtJ  is  the  solution  B  to  the  matrix  equation 
BDj  -  DiB  =  (X-MA%,  so  G  =  FD  -  DF,  where 

r  J  0  for    i  =  j, 

Uij    ~    \  FijDj-DiFij    for    i^j, 

j  0  for    t  =  j, 

~     \  {X-xAX)a    for    i  +  j. 

Hence  FD  -  DF  =  G  =  nondiag(Ar_1  AXJ,  where  nondiag(M)  is  a  block  matrix 
formed  from  the  block  matrix  M  by  replacing  the  diagonal  blocks  by  zero  matrices. 
Since  A"-1  AX  =  A  +  A£-£A  +  L,  hence  nondiag(,Y_1  AX)  =  nondiag(  A  +  \E-E\ 

-\-L  ).  That  is 

FL>-£>F  =  nondiag(A£-£A  +  L).  (2.30) 

For  i  ^  j,  taking  the  (i,  j)  block  on  both  sides  of  (2.30),  we  obtain 

F^Dj-DiFij    =    -{EvAj-Ai^  +  Lu.  (2.31) 

Let  H{j  =  EijAj  -  AiEij,  and  let  Dj  =  UEU'1 ,  and  Dt  =  WlV'1  be  diagonal- 
izations  of  Dj,  and  Dt  respectively.  Substituting  these  values  in  (2.31),  and  then 
simplifying,  we  have 

V-^FijUH  -  nV^FijU  =  -V-lHl3U  +  V^UjU. 
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Let  W  =  V-lFijU.  Then  the  above  equation  reduces  to  WZ  -  Q\V  =  -V^HijU  + 

V~xLijU.    Now  solving  for  W  =  (l0jm),  we  get  wtm   =   -p,m  +  qlm,  where  plm   = 

,-i—UsLt   qim  = V-Um.^    I  =   1,2,..., m„  and  m   =   1,2 m..    Let 

CTm  —  u,'/  crm  -  ui 

P  =  (p/m),  and  Q  =  (g/m).  Then  simplifying  the  relation  W  =  —  P  +  Q,  we  obtain 

F.j  =  -VPU'1  +  VQU~l.  (2.32) 

Next,  let  Aj  =  c/iEit/x-1,  and  A,  =  ViHjVi-1  be  diagonalizations  of  As,  and  A, 
respectively.  Using  (2.29),  we  have  ||A,  -  All  <  0(||£||2),  where  /  =  i.j.  So  there 
exist  6i,e2,ri,  and  T2  such  that  Uj  =  U  +  0X,    S,  =  S  +  T,.    \i  =  \'  +  Q2,  and 

ft,  =ft  +  r2)  and  ||0(||  <  0(||£||2)  and  ||I\||  <  0(||£||2),  where  t  =  1,2.  Now 

a,  =  (^4-e1)(s  +  r1)(t/  +  01)-1 

=  (ue  +  ut.  +  q^  +  q^^u-1  (z  +  e.r-1)"1 

=  (uzu-y  +  ['!,(/-'  +  ©iEcr1  +  e^jt'-1)  (/  -  Gjf-1  (/  +  q^-1)-1) 

where    A,     =     c/IV/-1   +  GiEtM   +  e^itf"1   -  (c/St/"1  +  cTjc/"1  +  G^'"1 
+  Gir,f'-1)G1^-1(/  +  G1f/-1)"1-  Hence 

l|A,||   <   Wur.u-'W  +  IIQiSt-'-1!!  +  IIQ.r.r'-1!!  +  \\ueu~1  +  jt.c/-1  +  e.sr-1 
+  Q1rlL/-1||||G1r'-,||||(/  +  G1r'-1)"1|| 

<    0(||/-||2). 
Similarly,  A,-  =  VW"1  +  A2  with  ||A2||  <  0(||£||2).  Since  lltJ  =  EX}.\J  -  \,E,},  so 

n.j  =  eij(c/s^-1  +  a1)-(iq\-,  +  a2)eij 

V-'H.jU    =    V-lEtJUE  -  tlV^EijU  +  I'"1  (JS^-Ai  -  S2EI})U.      (2.33) 

Taking  the  (l.ru)  entry  on  both  sides  of  (2. •'{•'{),  we  have 
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After  simplifying  the  last  equation,  we  get  pim  = —  =  \V~1EijU)      +s/m, 


where  s/m  =  - 1^2^11 "<"l^"^i    Let  5  =  (s/m),  then  P  =  V^E^U  +  S, 


(V-1(ElJA1-A2ElJ)U) 
and  with  this  value  of  P  (2.32)  reduces  to 


-l 


Fa    =    -VV-'E^UU-1  +V(-S  +  Q)U 
where  /t,j  =  V(  —  b  +  Q)U     .    bince  s/m  =  — ,  and  g/m  = 

^m  —  Ul 


M    so 


<7m   -  W/ 

||i2.M  <  ||v||/lg:^|(!|gd! llA*li  + 11^1! [1MB ,  HO Ifedl ^/||N|n^n 

V  min|  crm  -  u;/  |  min|  crm  -  w/  |    / 

<     0(||is||2). 

Thus  F  =  -E  +  R,  where  R  =  {RtJ)  with  \\R\\  =  0(||£||2).  Next 

Xnew     =     X(I  +  F) 

=    Y(I  +  E)(I-E  +  R) 
=    Y(j-E2)+Y(I  +  E)R 
Xnew_Y    =    _ye2  +  Y(I  +  E)R. 

Hence  \\Xnew  -  Y\\  =  0(\\E\\2).  This  proves  the  second  part  of  Theorem  2.4.1.  □ 

The  result  of  the  above  theorem  implies  that  the  block  diagonalization  of  a  non- 
defective  matrix  converges  locally  quadratically. 

2.5     Block  Schur  Decomposition  of  a  Matrix 

The  iterative  methods  we  developed  so  far  to  find  the  eigenvalues  and  correspond- 
ing eigenvectors  of  a  matrix  have  a  little  practical  interests,  unless  we  have  a  good 
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approximation  to  the  matrix  of  eigenvectors,  and  in  some  methods  a  good  approxi- 
mation to  the  eigenvalues  as  well.  Here  we  will  find  an  iterative  method  to  compute 
a  block  Schur  decomposition  of  a  matrix  of  A. 

Let  Abe  annxn  complex  matrix.  Let  A  —  SUSH  be  a  block  Schur  decomposition 
of  A,  where  U  is  a  k  x  k  block  upper  triangular  matrix  whose  (i,j)  block  is  an  m,  x  rrij 

matrix,  and  5  =  [  Si,  S2,  •  •  • ,  Sk  ]  is  a  compatible  block  column  unitary  matrix  such 

j 
that  ASj  =  ^2  SiUij.  It  can  always  be  arranged  so  that  the  eigenvalues  of  Uu,  and  Ujj 

1=1 
for  i  7^  j  are  distinct.  Before  we  derive  the  necessary  equations  to  find  a  block  Schur 

decomposition  of  a  matrix  A,  we  discuss  the  effect  of  perturbations  in  the  elements  of 

A  on  the  columns  of  the  unitary  matrix  S,  and  the  elements  of  the  upper  triangular 

matrix  U  in  its  Schur  decomposition  SH AS  =  U.  That  is  given  an  arbitrary  matrix 

E,  and  for  a  sufficiently  small  e,  we  will  show  in  the  next  theorem  that  there  exist 

continuously  differentiable  functions  S(e),  and  U(t)  with  5(0)  =  5,  and  U(0)  =  U 

such  that  {A  +  eE)S{e)  =  S(e)U{e). 

Theorem  2.5.1  Let  A  be  an  n  x  n  complex  matrix.  Suppose  SH AS  =  U  is  a  block 
Schur  decomposition  of  A  such  that  the  diagonal  blocks  Uu,  <ui<l  I' n  for  i  ^  j  have 
distinct  eigenvalues.  Then  given  an  arbitrary  n  x  n  complex  matrix  E,  and  for  a 
sufficiently  small  e,  there  exist  continuously  differentiablt  functions  S(e),  and  I  (<  ) 
with  5(0)  =  S,  and  U(0)  =  U  such  that  the  following  matrix  equation  holds: 

(A  +  eE)S(e)  =  S(e)U(e).  (2.34) 

Proof  of  Theorem  2.5.1  First  we  discuss  the  effect  of  perturbations  in  the  elements 
of  A  on  the  columns  of  the  unitary  matrix  S  in  its  Schur  decomposition  S  AS  =  V . 
The  columns  S\,s2,. . .  ,sn  o[  S  form  a  basis  for  C".  Given  an  arbitrary  11  X  ;/  matrix 
E,  and  for  a  sufficiently  small  c,  let  S{t)~l(A  +  tE)S(t)  =  ('(()  be  a  triangular 
decomposition  of  A  +  eE  by  an   invertible  matrix  S(e).    Let   D  =   S    (A  +  </•.')>'. 
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and  let  Q((.)~1DQ(e)  =  U(e)  be  the  triangular  decomposition  of  D  by  the  invertible 
matrix  Q(t).  Then  we  have  st(e)  =  Sqi(e),  i  =  1, . . .  ,  n.  The  form  of  the  matrix  D 
is  illustrated  for  a  7  x  7  matrix  with  k  =  5,  such  that  dim(Uu)  =  dim([/22)  =  2,  and 
dimftftf)  =  1,  j  =  3,4,5: 


D  = 


"ll      «12      "l3      "14      U15      U16  Un 

"ll       "23       "24       "25       "26  "27 

"33       "34       "35       "36  "37 

"33       "45       "46  "47 

"55       "56  "57 

"66  "67 

"77 


+  e 


#11  #12  #13  #14  #15  #16  #17 

#21  #22  #23  #24  #25  #26  #27 

#31  #32  #33  #34  #35  #36  #37 

#41  #42  #43  #44  #45  #46  #47 

#51  #52  #53  #54  #55  #56  #57 

#61  #62  #63  #64  #65  #66  #67 

#71  #72  #73  #74  #75  #76  #77 


where  gtJ  =  s^Esj.  Equating  the  jth  column  on  both  sides  of  DQ(e)  =  Q(e)U(e), 
we  obtain 

n  n  j 

]Cu«''?tf(e)  +  eJ2sim(e)  =  Sfla(c)««(e)>  (2-35) 

l=i  1=1  /=1 

i,j  =  1, ...  ,n.  Since  Uj(0)  =  iij,  so  we  must  have  </j(0)  =  e,-,  which  is  the  jth  unit 
vector.  Next  we  normalize  q3{t),  j  =  1, . . .  ,  n  by  setting  qij(e)  =  0,  t  =  1,. .  ,,j  —  1, 
and  taking  1  as  the  largest  element.  For  sufficiently  small  e,  claim  that  qi\{e)  =  1, 
which  is  the  1st  component  of  qi{t).  If  not,  let  qni(t)  =  1-  Then  from  (2.35)  we  get 

n 

"11(e)  -  unn  =  t^2gniqn{€). 
1=1 

Now  letting  e  — ►  0,  we  find  tfn  —  unn  =  0,  a  contradiction.   Next  we  will  show  that 

|<7ni(e)|  <  Mnit    as  t  — >  0,  for  some  Mnl  >  0.   Since  |</ii(e)|  <  1,  so  from  (2.35),  we 

obtain 

n 
|"ll(e)  -  "nn|  knl(e)|    <  <^  |#n/|- 


;=i 


1 


Let  |un(e)  -  unn\  >  -\un  -  unn\,  then  |g„i(e)|  <  MniC,  where 


M, 


nl 


;=i 

|"11   -  "nn|' 


52 


Using  the  above  arguments  to  the  components  qn,  i  =  3 /?  —  1  in  the  decreas- 
ing order  of  the  row  index  t,  we  can  show  that  9.1(e)  ^  1,  and  |9,i(e)|  <  M,](.  where 
M,\  >  0.  Again  from  (2.35),  we  have 

n  n 

(ttU(e)  -  Un)qu(e)  =  ]£"i/9ii(e)  +  t^guqnie). 

1=2  1=1 

n  n 

(«ii(c)  -  un)q2i(()  =  ^2u2iqn{t)  +  tJ2g2iqn(t). 

1=3  /=1 

To  make  the  above  two  equations  consistent  we  take  921(e)  =  0.  Using  similar  argu- 
ments, and  considering  the  elements  in  the  decreasing  order  of  the  row  index  ;.  we 
can  show  that  |c/,j(e)|  <  MtJe  for  some  MtJ  >  0,  j  =  2, . . .  ,n  —  1,  i  —  j  +  1, . . .  ,n. 

To  show  |</i_,-(e)|  <  M>jt,  we  use  the  results  |c/,;(e)|   <  M,-/e,    /  =   1 j  —  1,  and 

|%(0|  <  A/y c,  /  =  i  -f  1, . . . ,  n.  To  keep  the  consistency,  we  take  943(e)  =  0.  Let 

,  r  maX  ^  r 

M  =         .  .      .        Mij. 

Then  \q,j{c)\  <  Me,  j  =  1,. . .  ,n  -  1,  i  =  j  +  1 n. 

Thus  t/u(e)  are  of  order  c  for  j  =  1,. . .  ,n  —  1,  i  =  j  + 1, . . .  ,n,  except  921(e).  and 
943(f)-  which  are  equal  to  zero.  Next  from  (2.35),  we  have 

J-1  n  n 

(Ujj(e)  -  u,,)q,j{e)  +  Y^  qu(e)uij(e)  -    J2   "«%'(*)  _  e5u  =  f    H   5./<7o(e). 

/  =  1  /=i+l  /=J  +  1 


I  he  term  on  the  right  is  of  order  t2,  so 

j-i  \ 

=  0(e2). 


'/  ."  ) :    9ij  +  Y,  u»'9y(0)  -  H  (/.'/(°)"0 


UjJ   ~  U"     \  /=,  +  ]  /=1 

Next  we  discuss  the  effed  of  perturbations  in  the  elements  of  A  on  the  elements 
of  the  upper  triangular  matrix  I  in  its  Schui  decomposition  S  AS  =  V.  As  in  the 
earlier  discussion,  let  D  =  S"(A  +  eE)S,  and  Q(c)~l DQ{c)  =  /"(<  )  be  the  triangular 
decomposition  of  D  by  the  invertible  matrix  (}(()■     To  illustrate,  we  consider  the 
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same  7x7  matrix  D  with  k  =  5.  Here  we  will  use  the  results  of  perturbations  of  the 
columns  of  Q(e).  Putting  j  =  1,  i  =  1,  and  j  =  2,  i  =  1,2  in  (2.35),  we  obtain 

n  n 

X]"n<m(e)  +  e^I#i/<7/i(e)  =  "11(e), 


;=i 


/=i 


X!ui'*2(c)  +  ej^ftztfiafc)  =  "12(e), 


/=2 


1=2 


X!"2/9/2(e)  +  eX^52;9;2(e)  =  "11(e). 
j=2  ;=2 

After  simplifying  the  above  equations,  we  get 

/  n  \  n 

"11(e)  -     tin  +  egu  +J2uuqn(e)     =  e^2guqn(t), 


1=3 


1=3 


tl2(e)  -  ("12  +  e#i2  +  X)"i'9'2(e)  )  =  £Emi2(0- 


/=3 


(=3 


"11(e)  -  fun  +  eg22  +  5^"2i9i2(e)  I  =  e^ 92iqi2^) ■ 

\  1=3  I  (=3 

For  sufficiently  small  e,  let  |"n(e)  —  u,-,-|  >  -|"n  —  "i«|,   i  =  3,5,...,n.    Then  the 
terms  on  the  right  of  the  above  equations  will  be  of  order  e2,  and  we  have 


"11(e)  -  \uu  +  egn  +  £^«il?ii(0) 


/=3 


(12(e)  -     U12  +  eg12  +  e  J2  uuq'i2W 


/=3 


"ll(e)  -       Un  +  ££f22  +  eE"2'?l2(°) 


/=3 


=  0(e2), 

=  0(e2), 
=  0(e2). 


Thus  perturbations  in  uu,  arid  t/i2  are  of  order  e  as  e  — >  0.  Due  to  Theorem  2.1.1,  it 
is  always  possible  to  find  a  sufficiently  small  e,  such  that  \ujj(c)  —  u„|  >  —  |"jj  —  "j«| 
for  i  ^  j,  i,j  =  1,3,5, ...  ,n.  So  after  simplifying  (2.35),  we  can  show  that  for 
i  <j,  j  =  3,  5,...,  n, 

"ij(e)-  \uij  +  egij+   j]   ",7%(e)  -  X)fti(e)"ii     =  °(e2)- 
V  ;=j+i  i=i  / 
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Hence 


■  -1 


/=i 


0(62). 


Thus  the  perturbation  in  Uij  is  of  order  e  as  e  — >  0.  Hence  c/(e),  and  5(e)  are 
different iable  functions  of  e,  and  their  first  derivatives  are  continuous.  This  completes 
the  proof  of  the  theorem.  □ 

Now  we  are  ready  to  derive  the  necessary  equations  to  find  a  block  Schur  decom- 
position A  =  SUSH .  Given  an  n  x  n  complex  matrix  E,  and  for  a  sufficiently  small 
e,  let  5(e),  and  U(e)  be  continuously  differentiable  functions  of  e  such  that  5(0)  =  5, 
and  £/(0)  =  U .  Differentiating  (2.34)  with  respect  to  e  and  then  putting  e  =  0.  we 
obtain 

AS'(O)  +  ES  =  S'(Q)U  +  SU'(O) 

SHAS'(0)  +  SHES  =  SHS'(0)U  +  U'(0).  (2.36) 

Since  S"A  =  US",  so  (2.36)  gives 


US"S'{0)  +  SHES  =  S"S'(0)U  +  U'{0). 


Taking  the  jth  column  block  on  both  sides  of  (2.37),  we  have 

USHS'J(0)  +  S"ESj  =  SHS'(0)Uj  +  C/j(0). 
Next  consider  the  following  expressions  for  5j(e): 

Si(e)  =  '£SiBii(e), 


(2.37) 


(2.3S) 


(2.39) 


i=i 


where  B,j{t)  is  an  in,  x  m :  matrix.   We  normalize  (2.39)  by  taking  Bjj(t)  =  I,  and 

Bij(t)  =  0  for  i  <  j.   Differentiating  (2.39)  with  respect  to  e  and  then  setting  t  =  0. 

we  get 

>;„))=    £  5,/r(0).  (2.40) 

.=J+1 
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Since  S  is  unitary,  so 


Using  (2.40),  and  (2.41)  we  get  SHS'(0)  =  B'{0).  So  (2.38)  reduces  to 

UB'jiO)  +  SHESJ  =  B'iOWj  +  t/;(o), 

where  £■(())  is  the  jth  column  block  of  B'{0).  Hence  for  each  j,  we  have  the  following 
two  matrix  equations: 

J2  UuB'^Q)  +  S^ESj  =  £  B'lt(0)Uu  +  U'^0)      for  i  <  j,   j  =  1,2, . . . ,  k. 


i=j+i 


/=i 


2  %BJi(0)  +  Sf  ESj  =  £  ^(0)t/o   for  i  >  j,  j  =  1, . . . ,  k  -  1. 


l=i 


1=1 


The  last  two  equations  can  be  rewritten  as: 


t-i 


DJ-(O)  =  Sf  JStf ;  +   £   %^(0)  -  £  ^,(0)^     for  i  <i,  j  =  1,2,...,  fc.    (2.42) 


/=?+i 


;=i 


k  3-1 

B'nWii  -  UuBljiQ)  =  S?ES,  +  £  UuB'^O)  -  YPMVii  far  •>  j,  j  =  1,  ...,*- 1. 

(2.43) 

Solution  of  (2.42)  is  dependent  on  the  solution  of  (2.43).  So  we  focus  on  how  to  find 
the  solution  of  (2.43).  From  the  right  hand  side  of  (2.43),  it  is  clear  that  to  find  B'^O), 
we  need  entries  below  5-^(0)  on  the  jth  block  column;  namely  B'i+lj(0), . . . ,  B'kj(0), 
and  entries  on  the  left  of  5^(0)  on  the  ith  block  row;  namely  5^(0), . . . ,  B'-^O). 
Pictorially  these  can  be  described  with  zero  suppressed  in  B^O)  as: 


Bh 


B'ij-i 


i    Bi+lj 

I        \ 
I      B'ki 


56 


That  is  to  determine  B'{j,  we  need  entries  just  above  the  arrowheads  directing  towards 
the  left,  and  just  on  the  right  of  the  downwards  arrowheads. 

Here  we  will  give  a  sketch  to  find  B\}.    Let  C  =  S^ESj.    If  i  <  k,  then  C  = 

C+  ^2   Uil-Blji  and  'O  >  U  tnen  C  —  ^  —  $Z  B'uUij.  Next  we  need  to  solve  the  matrix 

equation  B'^Ujj  —  UitB'tJ  =  C  for  B'-.  To  solve  this  equation,  let  P  =  B[y  F  =  Ujj, 

and  G  =  Ui,.  Then  it  becomes  GP  —  PF  =  /?,  where  R  =  —  C.  Since  F,  and  G  are 

upper  triangular  matrices,  so  using  the  following  method  (  detail  is  in  [Gol90,  page 

387]),  we  can  solve  the  matrix  equation  GP  —  PF  =  R  for  P.  Let  R  =  [rj,  ?-2, . .  .  ,rmJ, 

and  P  =  [pi,P2i  ■  ■  ■  iPnij  ]  be  column  partitionings,  then  solve 

/-i 

(G  -  fill)  pi   =   r,  +    J2   fmlPm 
m  =  \ 

for  pi.  I  =  l....,m.r  Once  we  obtain  P,  then  B'tJ  =  P  for  i  >  j,  jf  =  1,2 A-  —  1. 

2.6     An  Algorithm  for  Block  Schur  Decomposition  of  a  Matrix 

In  Section  2.5,  we  derived  the  necessary  equations  to  find  a  block  Schur  decom- 
position of  a  matrix.  Here  we  plan  to  develop  an  algorithm  using  Armijo's  rule  from 
optimization  theory  to  find  a  block  Schur  decomposition  of  a  matrix. 

Let  A  be  an  ??  x  n  complex  matrix.  Suppose  .4  =  SUS  is  a  block  Schur  de- 
composition of  .4,  where  V  is  a  k  x  k  block  upper  triangular  matrix  such  that  (',,. 

and  Ujj  for  i  ^  j  have  distinct  eigenvalues,  and  S  =  [Si,  52, . . .  ,5*  ]  is  a  compatible 

j 
block  column  unitary  matrix  such  that  ASj  =  )] SjUjj.  Given  an  arbitrary  complex 

1=1 
mat  1  ix  E,  and  for  a  sufficienl  ly  small  e,  let  5(e),  and  (!{t)  be  analytic  functions  of  e, 

such  that   5(0)  =  5,  t'(0)  =  T.  and 

(A  +  eE)S(e)  =  S(e)U(e).  (2.11) 

A  first  order  Taylor  Expansion  gives 

5(e)    *    5(0)  +  e5'(0),  ,„ 

[/(e)    =    /(())  +  «/'(0).  K     ' 
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To  develop  an  algorithm,  suppose  SUSH  is  an  approximate  block  Schur  decompo- 
sition of  B.  We  identify  A,  and  E  in  (2.44)  with  SUSH,  and  B  -  SUS"  respectively. 
Let  S'(0)  =  SG(S,U),  where  Gi-j(S,U)  is  an  mt  x  m.j  zero  matrix  for  i  <  j,  and  for 
i  >  j,  Gij(S,  U)  is  the  solution  P  to  the  matrix  equation  PUjj  —  UaP  =  C,  where 

C  =  Sf  (B  -  SUS")  Sj  +  J2  UuGii{S,  U)  -  J2  Ga(S,  U)UU. 

l=l+\  1=1 

From  (2.37),  we  have 


USHS'(0)  +  SHES  =  SHS'(0)U  +  l/'(0). 

Next  substituting  the  values  of  E,  and  5'(0)  in  the  above  equation  and  then  simpli- 
fying, we  obtain 

U'(0)  =  UG(S,  U)  +  SHBS  -  U  -  G{S,  U)U. 

Now  substituting  the  values  of  c/'(0),  and  S'(0)  into  (2.45)  and  suppressing  zeros  in 
5(0),  and  U(0),  we  get 

S(e)    «    S{I  +  eG(S,U)), 


(2.46) 


U(t)    »    U  +  t  (UG{S,  U)  +  SHBS  -  U  -  G(S,  U)U)  . 
Taking  e  =  1  in  (2.46),  we  obtain 

gneu,       =      gold  ^  +  Q  ^gold^  jjold^  j  ^ 

jjnew       _       jjold/~i  I  qold     rrold\   /  qold\         r>  cold  /~i  f  cold    j  rold\   r  jold 

If  the  starting  guess  unitary  matrix  So  is  not  a  good  approximation  of  Q,  where 
B  =  QRQH  is  a  Schur  decomposition  of  B,  then  in  the  update  Snew  =  S0,d(I+ 
G(SoW,{/oW)),  the  increment  SoldG  (Soid,  UM)  may  be  too  large.  However  with 
a  large  increment,  we  may  likely  have  instabilitiy  in  the  updating  process,  and  the 
algorithm  may  diverge.  To  restore  the  convergence  of  the  algorithm,  and  to  have  a 
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steady  change  in  the  values  of  Snew ,  and  Unew,  we  need  a  small  increment  in  each 
iteration.  To  achieve  this,  we  redefine  each  iterate,  and  introduce  a  small  positive 
parameter.  Let  f  be  a  positive  parameter;  define  S(t)  =  Sold  (/  +  tG  (Sold,Uold^, 
and  U(t)  =  Uold  +  t  (uo,dG  (Sold,U°'d)  +  (sold)H  BSold  -  Uold  -  G  (s°'d,Uold)  U°'A. 
Then  5(0)  =  Sold,  5(1)  =  Snew,  U(0)  =  Uold,  and  U(l)  =  Unew .  Define  /(f)  = 
||Z(f)||F,  where  Z(t)  =B-  S(t)U(t)S(t)-\  Then  /(0)  =  \\B  -  5(0)t/(0)5(0)//||F, 
and  /(l)  =  \\B  —  S(1)U(\)S{\)~1\\f-  When  the  starting  guess  50,  and  U0  are  good 
approximations  of  Q,  and  R  in  the  factorization  B  =  QRQ" ,  then  we  must  have 
/(l)  <  /(0).  As  earlier,  here  our  goal  is  to  find  a  /,  0  <  r  <  1,  for  which  f(t)  <  /(0) 
holds.  To  this  end,  we  use  Armijo's  rule  from  optimization  theory.  In  Armijo's  rule, 
we  determine  t  in  the  following  way.  Evaluate  f(t)  at  t  =  1,|,  |, . . .,  stopping  when 


/(o<(i-£)/(0). 


As  in  Section  2.3,  to  use  the  above  rule,  we  must  have  /'(0)  =  — /(0).   So  our  next 

aim  is  to  determine  /'(0),  when  f{t)  =  \\Z(t)\\F,  and  Z(t)  =  B  -  S(t)U(t)S(t)~l. 

Suppressing  the  superscripts  of  Sold,  and  Uold  in  the  definitions  of  S(t),  and  U(t)  we 

get 

S(t)    =    S(I  +  tG(S,U)), 

U{t)    =    U  +  t(UG{S,U)  +  S"BS-U  -G(S,U)U).  (       ' 

With  the  above  definition  of  /(f).  we  will  show  in  the  next  theorem  that  /'(0)  = 

-/(o). 

Theorem  2.6.1  Let  A  €  Cn> "  «//</  suppose  A  =  SUS  is  a  block  Schur  decomposition 
of  A,  what  V  is  a  k  X  k  block  upper  triangular  matrix  such  that  (',,.  and  Ujj  for 
i  ^  j  have  distinct  eigt  nvalues,  and  S  =  [Si,  S-2, .  ■  ■ ,  S*  ]  is  a  compatibh  block  column 
unitary  mains.  If  w,  defint  U(t),  S(t)  as  in  (J. \1).  Z(t)  =  A-S(t)U(t)S(t)'1,  and 
/(/)  =  ||Z(f)||,-.  then  /'(0)  =  -/(0). 
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Proof  of  Theorem  2.6.1  The  result  (2.18)  of  Lemma  2.3.1  gives 

f(0)ftf(t)\t=o  =  trace  (z(0)"^Z(t)|.=o)  .  (2.48) 

Differentiating  the  expression  for  Z{t)  with  respect  to  t,  and  then  evaluating  the 
derivative  at  t  =  0,  we  have 

jtz(t)\t=0  =  -p(t)\t=0u(o)S(oyl  -  sio^uit^oSioy1 

+S(0)U(0)S(0rljtS(t)\t=oS(0)-1.  (2.49) 

Next  differentiating  U{t),  and  S{t)  with  respect  to  t,  and  then  evaluating  the  deriva- 
tives at  t  =  0,  we  obtain 

jtU(t)\t=o  =  UG(S,  U)  +  SHAS  -U-  G(S,  U)U,      and     ^5(*)|t=o  =  SG(S,  U). 

Using  these  values  and  values  of  U{0),  and  5(0)  in  (2.49),  we  get 

4zm|t=o    =    -SG(S,U)USH-S(UG(S,U)  +  SHAS-U-G(S,U)U)SH 

at  v 

+sushsg{s,u)sh 

=  -(a-sush) 

=    -Z(0). 

Hence  —  Z(t)\t=0  =  -Z{0),  and  with  this  value  (2.48)  reduces  to 
dt 


'  dt' 

=  -/(o)2. 


/(0)4/(<)U=o     =     trace  (Z(0)H  (-Z(O))) 


So  4/(*)l*=o  =  -/(0),  provided  /(0)  ^  0.  □ 

at 

Thus  Armijo's  rule  can  be  applied  to  find  a  block  Schur  decomposition  of  a  matrix 
A  and  we  will  compute  a  block  Schur  decomposition  of  A  as  follows: 
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Algorithm  2.6.1  (Block  Schur  Decomposition)  Given  A  £  C"xn,  a  tolerance  tol 
greater  than  the  unit  roundoff,  a  coalescing  tolerance  toll  greater  than  the  square 
root  of  the  unit  roundoff,  an  upper  triangular  matrix  U0,  and  a  unitary  matrix  50,  the 
following  algorithm  uses  (2.47)  to  compute  a  block  Schur  decomposition  A  =  SUSH . 

Define  tol2  =  lOOtoll .  We  break  the  sketch  of  the  algorithm  into  several  steps. 

Step  0:  Take  S  =  5o,  U  =  Uq.  Let  NB  denote  the  number  of  diagonal  blocks  of 
U,  and  let  NS  denote  an  array  of  NB  elements  whose  ith  element  is  the  dimension 
of  the  ith  diagonal  block  of  U. 

Step  1:  Let  m  =  NB.  For  i  <  j,  define  atJ  =  min  {  \a  -  w|  :  a  £  \{l'„).  and 
w  £  -Mc/jj)  }.  If  ctij  <  toll,  then  merge  blocks  Uu  and  UJ:  to  form  a  single  block  using 
a  unitary  transformation  to  U .  Postmultiply  5  by  the  same  unitary  transformation. 
Update  NBold,  and  NSold  to  get  NBnew ,  and  NSnew .  Try  the  above  coupling  procedure 
for  all  possible  combinations  of  1  <  t  <  j  <  m.  Using  a  unitary  transformation 
arrange  diagonal  blocks  of  U  in  the  decreasing  order  of  sizes.  Postmultiply  S  by  the 
same  unitary  transformation. 

Step  2:  Construct  G(S,U)  as  follows: 
For  i  <  j,  take  Gij(S,  (')  =  0,  which  is  an  /7i,xm_,  zero  matrix,  and  for  i  >  j,  GtJ(S,  U) 
can  be  determined  by  using  the  following  loop. 

B  =  AS-SU 

for  ;  =  1  :  NB  -  1 

fori  =  NB  :  -1  :j '  +  1 
C  =  S[IBj 
if  /  <  NB 

C  =  C+  ^  UikGkj(S,U) 

k-i+i 

end 
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if  J  >  1 

j'-i 
C  =  C-Y,Gik(S,U)Uki 

k=\ 
end 

Solve  PUjj  -  UuP  =  C  for  P 

GtJ(S,U)  =  P 
end 
end 

Step  3:  Let  f(t)  =  \\Z(t)\\F,  where  Z(t)  =  A  -  S{t)U{t)S{t)-\  S{t)  =  S(I 
+  tG{S,U)),  and  U{t)  =  U  +  t  (UG(S,U)  +  SH  AS  -  U  -  G(S,U)U).  Evaluate /(t) 
at  t  =  1,|,^,|,...,  stopping  when 

/(*)<  (l  -  |)  /(0).  (2.50) 

Let  p  be  the  first  value  of  t  for  which  (2.50)  is  true. 

Step  4:  Suppose  fi  —  [  £7m+lm+1, . . . ,  Unbnb  ]T ,  where  0  <  m  <  NB  with  dim(c/t!) 
>    1  for  i   =   1,2,  ...,m,  and  dim(f/,-,-)   =   1  for  i  =  m  +  I,. . .  ,NB.     Define  d  = 

m 

diag  (UG(S,U)  +  SH AS  -U -G(S,U)U)  (t   :   72),  where  *    =    l  +  £WS(i),   and 

1=1 

diag(M)  (i  :  n)  is  a  vector  formed  from  t  through  n  elements  of  the  vector  diag(Af). 

Our  aim  is  to  find  two  indices  il,  and  iu  as  follows.    Let  k  be  an  array  with  the 

property  that  \/ik{i)\  <  |/<it(i+i)|-  Next  form  the  ratio: 

r  _  fikji+i)  ~  fik(i) 
dk(i)  —  dk(i+i) 

and  let  |r,0|  =  min{  \r,\  :  i  —  1,2, .. .  ,n  —  t,    Re(r,)  >  0  &   |r,-|  <  1  }.    Let  il  =  t— 

1+  min{  fc(i0),  fc(e'o  +  1)  },  and  iu  =  £  —  1+  max{  A-(i0),  fc(i0  +  1)  }• 

Step  5:  Update  U,  and  S  as  follows: 

c/"6"'    =    Uold  +  P  (uMG  (sold,  Uold)  +  (Sold) "  ASold 
-UoU  -  G  (sold,Uold)  Uold)  , 
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Snew    =    Sold[l  +  pG(Sold,Uold)) 


Let  Z  =  SnewUnew  (Snew)~l. 

Step  6:  Consider  the  block  U}]  with  1  <  j  <  m,  where  m  is  defined  as  in  Step 
4.  Let  Ujj  —  QRQH  be  a  Schur  decomposition  of  Ujj.  Find  a  unitary  matrix  P  such 
that  P"RP  =  V  =  D+  N,  where  D  =  diag(  V'n,  V22,  ...,  Vkk  ),  Ar  is  the  strictly 
upper  triangular  part  of  V,  lc  <  nij,  and  min{  \a  —  u>\  :  <7  G  A(\^),  and  w  €  A(V//)  } 
>  to/^.  Replace  ^  by  V7,  and  5j  by  5jQP.  Update  NBold,  and  .V5^w  to  get  NBnew, 
and  NSnew.  Try  the  above  decoupling  procedure  for  all  j  such  that  1  <  j  <  in. 

Step  7:  Create  a  2  x  2  diagonal  block  by  merging  Uu,n  and  c/1Ui,u  by  a  unitary 
transformation.  Postmultiply  S  by  the  same  unitary  transformation. 

Step  8:  Let  SM  =  QR  be  a  QR-factorization  of  Sold.  Then  take  Snew  =  Q,  and 
t/"™  =  QHZQ,  where  Z  is  the  matrix  from  Step  5. 

Step  9:  Compute  /(0)  =  \\A  -  Z\\F  and  goto  Step  1  until  /(0)  <  /o/. 

Example  2.6.1  If  we  apply  the  real  version  of  Algorithm  2.6.1  to  the  matrix  A  in 
Example  2.3.1  with  U0  =  diag  (  0,  -2,  0  ),  Su  =  /.  tol  =  KT6.  and  toll  =  1(T\  then 
after  9  iterations  the  upper  triangular  matrix  Uq  converges  to 


u 


-0.88991       2.72.-)')7       L.63393 

-0.73813    -1.11009    -1.82490 

0  0  0 


The  eigenvalues  of  U  are  A  =  [  -1  +  1.11  121/'.  -1  —  1.41421/,  0  ].  If  we  use  the 
QK  method  with  double  implicit  shift  to  .1  with  the  same  tolerance,  then  after  5 
iterations  we  gel  a  real  Schur  decomposition  A  =  QRQ1 .  where 


R  = 


0     1.7321  0 

-1.7321  -2    2.4495 

0  0  0 


.    and   Q 


0.8165  0     -0.")771 

I).  1I)M'     0.7071     -0.5771 
0,1082    0.7071        0.5774 
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Next  consider  the  following  matrix  Ai,  which  we  obtain  by  perturbing  the  elements 
of  A: 

A1  = 


0.001  1.002  1.001 
-2.002  -2.003  -1.001 
-2.002    -1.002       0.001 


If  we  apply  the  real  version  of  Algorithm  2.6.1  to  Ax  with  So  =  Q,  U0  =  R,  the 
tolerance,  and  the  coalescing  tolerance  are  the  same,  then  after  1  iteration  we  get 
a  block  Schur  decomposition  PTA1P  =  fJ,  whereas  if  we  use  the  QR  method  with 
double  implicit  shift  to  QTA1Q,  then  after  5  iterations  we  obtain  a  real  Schur  form 
VT(QTAlQ)V  =  r. 

Example  2.6.2  If  the  real  version  of  Algorithm  2.6.1  is  applied  to  the  matrix  A  in 
Example  2.3.2  with  U0  =  diag  (  -2,  1,  -1  ),  S0  =  I,  the  tolerance,  and  the  coalescing 
tolerance  are  as  in  Example  2.6.1,  then  after  10  iterations  the  upper  triangular  matrix 
Uq  converges  to 


U 


1.00104     -3.13124    -1.64159 
0  0    -1.22496 

0  0    -0.99896 


If  we  use  the  QR  method  with  double  implicit  shift  to  A  with  the  same  tolerance, 
then  after  6  iterations  we  obtain  a  real  Schur  form: 


QTAQ 


-1.04199     -3.46359     -0.69221 

0.00051     -0.95802       1.23323 

0  0  0 


Example  2.6.3  Consider  the  following  matrices  A,  and  P: 


0  0  0  0  0  0  0  -80 

10  0  0  0  0  0  236 

0  10  0  0  0  0  -164 

0  0  10  0  0  0  -141 

0  0  0  10  0  0  243 

0  0  0  0  10  0  -102 

0  0  0  0  0  10  2 

0  0  0  0  0  0  1  7 
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If  we  apply  the  QR  method  with  double  implicit  shift  to  A  with  tol  =  10_b.  then 
after  15  iterations  we  get  a  real  Schui  form  ST  AS  =  U.  The  eigenvalues  of  U  are 
A  =  [-4,  -1,  2,  2,  5,  1,  1,  1].  Now  if  the  real  version  of  Algorithm  2.6.1  is  applied  to 
the  perturbed  matrix  Aj  =  A+  P/5000  with  U0  =  U,  S0  =  S,  toll  =  \0~\  and  the 
tolerance  is  as  above,  then  after  3  iterations  we  obtain  a  block  Schur  decomposition 
QTAiQ  =  ft,  whereas  if  we  apply  the  QR  method  with  double  implicit  shift  to 
STAlS,  then  after  10  iterations  we  obtain  a  real  Schur  form  VT{STA1S)V  =  T. 

Although  the  real  version  of  Algorithm  2.6.1  works  for  all  matrices  A  with  a  diag- 
onal matrix  formed  from  the  diagonal  elements  of  .4,  as  the  starting  upper  triangular 
matrix  U0,  and  So  =  /.  it  has  some  disadvantages.    For  example,  if  the  size  of  the 
matrix  is  large,  then  we  need  to  take  a  very  large  coalescing  tolerance  compared  to 
toll  =  1CT4  through  several  iterations  at  the  beginning.   However  with  a  large  coa- 
lescing tolerance,  the  sizes  of  some  diagonal  blocks  of  U  become  large  compared  to  the 
size  of  A,  which  undermines  the  main  goal  of  this  algorithm;  namely  keep  the  size  of 
the  diagonal  blocks  of  U  as  small  as  possible.  In  Example  2.6.1,  and  2.6.3  we  noticed 
that  if  by  another  method,  a  real  Schur  form  .4  =  SUST  can  be  computed,  then  we 
can  use  U  as  the  starting  upper  triangular  matrix,  and  S  as  the  starting  orthogon.il 
matrix  to  obtain  a  block  Schur  decomposition  of  a  perturbed  matrix  A  +  tE  with 
very  rapid  convergence,  where  E  is  an  arbitrary  matrix,  and  e  is  a  small  scalar.    In 
the  following  table  for  different  values  of  n  we  give  the  number  of  iterations  required 
to  obtain  block  Schur  decompositions  of  A+eE  by  the  real  version  of  Algorithm  2.6.1 
with  f/0  =  U,  and  S0  =  S,  and  real  Schur  forms  of  ST(A  +  eE)S  by  the  QR  method 
with  double  implicit  shift,  where  ST AS  =  U  is  a  real  Schur  form  of  the  matrix  A. 
Here  the  original  matrices  A  =  (oy)  are  random  matrices,  where  <;,,  are  integers  and 
0  <  a,j  <   10,  the  matrices  E  =  (ey)  are  random  matrices,  where  < ,,  are  reals  and 
0  <  eij  <  1,  the  tolerance-  tol  =  10_ti,  and  the  coalescing  tolerance  toll  =  10-4: 
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size 

number  of  iterations 
for  our  algorithm 

number  of  iterations  for 
the  QR  algorithm  with  double  shift 

c 

A  +  eE 

A  +  eE 

ST(A  +  eE)S 

10 

2 

15 

19 

io-2 

20 

2 

27 

28 

io-2 

30 

2 

42 

45 

io-2 

40 

2 

53 

56 

io-2 

50 

2 

79 

70 

IO"2 

Example  2.6.4  If  Algorithm  2.6.1  is  applied  to  the  matrix 


A  = 


4 

-5 

0 

3 

_2 

1 

0 

-4 

-3 

-5 

-1 

2 

5 

4 

-3 

0 

5 

-3 

3 

0 

5 

4 

2 

-1 

6 

1 

3 

-4 

1 

_2 

2 

-3 

-1 

1 

5 

-3 

with  U0  =  diag(  19,  9.5  +  16.45448?,  -9.5  +  16.45448?',  -19,  -9.5  -  16.45448*,  9.5- 
16.45448?'),  where  the  elements  of  Uo  are  the  uniformly  distributed  points  on  a  circle, 
which  includes  all  Gerschgorin  disks  for  A,  So  =  /,  tol  =  10-6,  and  toll  —  10-4, 
then  after  13  iterations  we  obtain  a  block  Schur  decomposition  SH AS  =  U,  where 
U  =  D  +  N,  D  =  diag(  -2.25911  -  3.43367?,  1.83571  -  3.95888?,  -2.25911  + 
3.43367i,  -7.52532,  1.83571  +  3.95888i,  7.37213  ),  and  N  is  the  strictly  upper  tri- 
angular part  of  U.  Figure  2.1  shows  how  the  initial  eigenvalues  of  Uo  converged  to 
the  eigenvalues  of  A.  If  we  use  the  QR  method  with  double  implicit  shift  to  A  with 
the  above  tolerance,  then  after  9  iterations  we  obtain  a  real  Schur  form  QTAQ  =  R. 
Next  consider  the  following  matrix  A\,  which  we  get  by  perturbing  the  elements  of 
A: 


A, 


4.0022  -4.9949 

0.0068  -3.9940 

5.0091  4.0082 

3.0025  0.0076 

-5.9914  1.0046 

2.0047  -2.9905 


0.0063  3.0095  -1.9959  1.0072 

-2.9956  -4.9915  -0.9942  2.0080 

-2.9918  0.0029  5.0088  -2.9929 

5.0069  4.0054  2.0044  -0.9926 

3.0070  -3.9949  1.0073  -1.9998 
-0.9901  1.0010  5.0087  -2.9911 
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-15  -10 


15  20 


Figure  2.1.  The  path  of  convergence  of  the  eigenvalues  of  the  matrix  A. 

If  we  use  Algorithm  2.6.1  to  A\  with  Uo  =  U,  S0  =  5,  the  tolerance,  and  the  coalescing 
tolerance  are  as  above,  then  after  4  iterations  we  get  a  block  Schur  decomposition 
PH A\P  =  fi,  whereas  if  we  apply  the  QR  method  with  double  implicit  shift  to 
Q7 A\Q  with  the  same  tolerance,  then  after  8  iterations  we  obtain  a  real  Schur  form 
VT(QTAXQ)V  =  r. 

Examph  2.6.5  If  Algorithm  2.6.1  is  applied  to  the  matrices 


D 


B 


'000 

-10 

"  0 

0 

0    -1  " 

= 

1     0    0 
0     1     0 

18 
-15 

,  c  = 

1     0 
0     1 

0        1 
0     -6 
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with  diagonal  matrices,  A  =  diag(  19,  3  +  16/,  —13,  3  —  16/  ),  0  =  diag(  7,  2  + 
5i,  -3,  2  -  5z  ),  S  =  diag(  28,  -2  +  30/,  -32,  -2  -  30/  ),  and  fi  =  diag(  1,  0.5  + 
0.8660*',  -0.5  +  0.8660*,  -1,  -0.5  -  0.8660/,  0.5-0.8660/)  as  U0,  where  the  elements 
of  A,  0,  £,  and  Q  are  the  uniformly  distributed  points  on  circles,  which  include  all 
Gerschgorin  disks  for  B,  C,  D,  and  E  respectively,  So  =  1 ,  tol  =  10-6,  and  toll  = 
10-4,  then  after  16,  112,  12,  and  11  iterations  we  obtain  block  Schur  decompositions 
of  B,  C,  D,  and  E  respectively.  If  we  use  the  QR  method  with  double  implicit 
shift,  then  after  6,  14,  and  1  iterations  we  obtain  real  Schur  forms  of  B,  C,  and  D 
respectively,  whereas  E  diverges. 

Example  2.6.6  If  we  apply  Algorithm  2.6.1  to 

A  =  Xdiag(  1,  1,  1,  1,  2,  2,  2,  3,  3,  0,  -1,  -1,  -1,  -1,  -1,  -5,  -5,  4,  5,  -4  )X~\ 

where  X  =  Pdiag(  1,  2,  3,  4,  5,  6,  7,  8,  9,  10,-1,  -2,  -3,  -4,  -5,  -6,  -7,  -8,  -9, 
-10  )Q,  P,  and  Q  are  orthogonal  matrices  with  U0  —  diag(  43.6939,  41.5811  + 
13.3395/,  35.4496+25.3733/,  25.8996+34.9233/,  13.8658+41.0548/,  0.5263+43.1676/, 
-12.8132  +  41.0548/,  -24.8470  +  34.9233/,  -34.3970  +  25.3733/,  -40.5285  +  13.3395/, 
-42.6413,  -40.5285-13.3395*,  -34.3970-25.3733*,  -24.8470-34.9233*,  -12. 8132  — 
41.0548*,  0.5263-43.1676*,  13. 8658-41. 0548/,  25.8996-34.9233/,  35.4496-25.3733/, 
41.5811  —  13.3395/),  where  the  elementsof  Uq  are  the  uniformly  distributed  points  on  a 
circle,  which  includes  all  Gerschgorin  disks  for  A,  So  =  I,  tol  —  10-6,  and  toll  =  10-4, 
then  after  29  iterations  we  obtain  a  block  Schur  decomposition  S  AS  =  U,  where 
U  =  D  +  N,  D  =  diag(  -1,  -1,  -1,  -1,  -1,  1,  1,  1,  1,  2,  2,  2,  3,  3,  -5,  -5,  0,  4, 
—4,  5  ),  and  A^  is  the  strictly  upper  triangular  part  of  U.  If  we  use  the  QR  method 
with  double  implicit  shift  to  A,  then  after  18  iterations  we  get  a  real  Schur  form 
QTAQ  =  R.  Next  consider  the  perturbed  matrix  Ax  -  A  +  P/1000,  where  P  is  the 
matrix  from  Example  2.1.3.  If  we  apply  Algorithm  2.6.1  to  Ai  with  Uq  =  U,  So  =  S, 
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the  tolerance,  and  the  coalescing  tolerance  are  the  same,  then  after  4  iterations  we 
obtain  a  block  Schur  form  PH A\P  =  $7,  whereas  if  we  apply  the  QR  method  with 
double  implicit  shift  to  QTAtQ,  then  after  19  iterations  we  obtain  a  real  Schur  form 
VT(QTA1Q)V  =  T. 

Example  2.6.7  If  Algorithm  2.6.1  is  applied  to  the  matrix  C  in  Example  2.1.3  with 
r/o=diag(139.50,132.6969+42.9537s,112.9537+81.7022s,  82.2022+112.4537s,  43.4537 
+132.1969*',  0.50  +  139.00i,  -42.4537  +  132.1969s,  -81.2022  +  112.4537i,  -111.9537  + 

81.7022/,  -131.6969  +  42.9537i,  -138.50,  -131. 6969-42.9537*',  -111.9537-81. 7022?', 
-81. 2022-  112.4537i',-42.4537- 132. 1969?,  0.50- 139.0?',  43.4537- 132. 1969*",  82.2022 
-112.4537?.  112.9537  -  81.7022s,  132.6969  -  42.9537?  ),  where  the  elements  of  U0  are 
the  uniformly  distributed  points  on  a  circle,  which  includes  all  Gerschgorin  disks  for 
C,  So  =  /,  fol  =  10-6,  and  toll  =  10-4,  then  after  26  iterations  we  obtain  a  block 
Schur  form  SHCS  —  U.  Figure  2.2  shows  how  the  initial  eigenvalues  of  U0  converged 
to  the  eigenvalues  of  C.  If  we  use  the  QR  method  with  double  implicit  shift  to  C, 
then  after  27  iterations  we  obtain  a  real  Schur  form  QTCQ  =  R.  Next  consider  the 
perturbed  matrix  C\  =  C  +  f/1000,  where  P  is  the  matrix  from  Example  2.1.3. 
If  we  apply  Algorithm  2.6.1  to  C\  with  U0  =  U,  So  =  5,  the  tolerance,  and  the 
coalescing  tolerance  are  the  same,  then  after  2  iterations  we  obtain  a  block  Schur 
form  l'"(\P  =  J7,  whereas  if  we  apply  the  QR  method  with  double  implicit  shift  to 
Q7C\Q,  then  after  27  iterations  we  obtain  a  real  Schur  form  \'  (Q1  (\Q)\    =  1  . 

In  the  following  table  for  different  values  of  n  we  give  the  number  of  iterations 
required  to  obtain  block  Schur  decompositions  of  A  by  Algorithm  2.6.1  with  diagonal 
matrices  as  £/o,  where  the  elements  of  U0  are  the  uniformly  distributed  points  on 
circles,  which  include  all  Gerschgorin  disks  for  A,  S0  =  /.  and  real  Schur  forms  of  A 
by  the  QR  method  with  double  implicil  shift.  Here  A  =  (</,,)  are  random  matrices, 
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Figure  2.2.  The  path  of  convergence  of  the  eigenvalues  of  C. 

where  atJ  are  real  numbers  and  0  <  atJ   <   1,  the  tolerance  tol  =   10-6,  and  the 
coalescing  tolerance  toll  =  10-4: 


size 

number  of  iterations 
for  our  algorithm 

number  of  iterations  for 
the  QR  algorithm  with  double  shift 

20 

25 

27 

40 

11 

58 

60 

64 

79 

80 

75 

110 

100 

80 

134 

120 

93 

178 

140 

10:5 

193 

Algorithm  2.6.1  can  be  applied  to  all  matrices  A  with  So  =  /,  and  diagonal  ma- 
trices as  (To,  where  the  elements  of  Uq  are  the  uniformly  distributed  points  on  circles, 
which  include  all  Gerschgorin  disks  for  A.  Usually  the  stepsize  in  the  first  iteration 
becomes  1.  If  we  use  this  stepsize,  then  after  one  iteration  all  diagonal  elements  of 
the  resulting  upper  triangular  matrix  become  real,  and  this  creates  problem  in  the 
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convergence.  To  prevent  it,  we  take  one  fourth  of  the  stepsize  from  iteration  1  to 
iteration  2  or  4.  If  the  size  of  the  matrix  is  very  large,  then  we  may  have  to  take  one 
fourth  of  the  stepsize  from  iteration  1  to  iteration  6  or  10.  The  reason  of  doing  this  is 
the  following.  If  all  diagonal  entries  of  the  upper  triangular  matrix  at  the  beginning 
become  real,  then  the  approximate  eigenvalues  are  clustered  on  a  segment  of  the 
real  line.  When  two  elements  are  close  to  each  other,  but  not  within  the  coalescing 
tolerance,  then  in  the  updating  process  we  are  dividing  by  a  small  number  (  which 
is  their  difference  ).  If  this  happens  in  almost  all  iterations  at  the  beginning,  which 
is  the  case  when  the  size  of  a  matrix  is  greater  than  10,  then  it  brings  instability  in 
the  updating  process.  In  fact  for  that  reason  the  real  version  of  Algorithm  2.6.1  does 
not  work  very  well,  when  a  diagonal  matrix  formed  from  the  diagonal  elements  of 
the  given  matrix  is  used  as  the  starting  upper  triangular  matrix. 

2.7     Parallel  Processing  in  Eigencomputations 

We  already  discussed  the  derivation  of  five  algorithms  in  the  previous  sections  to 
find  all  eigenvalues  and  corresponding  eigenvectors  of  a  matrix.  To  implement  each 
algorithm,  we  need  either  a  matrix  of  approximate  eigenvectors,  or  all  approximate 
eigenvalues,  or  both.  But  except  for  the  last  one,  we  have  not  been  able  to  come 
up  with  ways  how  to  choose  either  a  matrix  of  approximate  eigenvectors,  or  all 
approximate  eigenvalues  for  the  starting  guess.  So  in  this  section,  we  will  only  discuss 
the  implementation  of  the  last  algorithm  in  a  parallel  computer. 

Most  of  the  algorithms,  which  are  available  to  find  all  eigenvalues  of  an  unsym- 
metric  matrix  are  serial  algorithms,  for  example,  consider  the  famous  QR  algorithm. 
which  is  used  to  find  all  eigenvalues  of  an  u  X  n  unsymmet ric  matrix  A.  Do  make  it 
computationally  feasible  first,  .1  is  reduced  to  a  Hessenberg  form  //.  and  then  double 
implicit   shift    (    Francis  QK   Step   )  is  used  ill  each  iteration  to  obtain  a  real  Schnr 
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form.  The  algorithm,  which  reduces  an  n  x  n  matrix  A  to  a  Hessenberg  form  H,  and 
the  algorithm  which  uses  double  implicit  shift  to  H  are  both  serial  algorithms. 

The  motivation  behind  the  derivation  of  various  algorithms  in  this  chapter  is  to 
find  all  eigenvalues  of  an  unsymmetric  matrix  by  making  use  of  a  sensitivity  result 
for  eigenvalues  and  eigenvectors.  Then  try  to  determine  whether  an  algorithm  can  be 
implemented  in  a  parallel  computer.  We  already  mentioned  that  except  for  the  last 
algorithm,  all  other  algorithms  described  in  this  chapter  have  some  shortcomings.  So 
we  will  give  a  sketch,  how  and  where  to  modify  the  last  algorithm  such  that  it  can  be 
implemented  in  a  parallel  computer.  For  clarity,  we  will  assume  that  n  =  rp,  where 
p  is  the  number  of  processors.  We  will  also  use  the  notation  Proc(i)  to  denote  the 
ith  processor. 

In  Step  1,  with  p  processors,  we  can  accelerate  the  merging  of  diagonal  blocks  in 
the  following  way.  For  a  fixed  i,  1  <  i  <  NB,  and  NB  -  i  <  p,  we  can  find  a3  = 
min{  \a  —  u\  :  a  G  A  ([/,-,•),  and  u  €  \{U]3)  },  j  =  i  +  1,»  +  2, . . . ,  NB,  in  separate 
processors.  If  a3  <  toll ,  then  we  can  merge  Ujj  with  Uu  in  the  increasing  order  of 
j.  If  NB  —  i  >  p,  then  we  have  to  use  some  processors  more  than  once.  Proc(£)  will 
compute  a;,  where  I  £  £  +  i  :  p  :  NB.  Parallel  processors  can  also  be  used  to  arrange 
diagonal  blocks  of  U  in  the  decreasing  order  of  sizes.  Let  [x]  denote  the  smallest 
integer  such  that  x  <  [x]  <  x  +  1.    If  U  has  k  blocks  along  the  diagonal,  then  not 

more  than  — ^— swaps  among  the  diagonal  blocks  are  necessary  to  arrange  them 

fc(fc-l) 

in  the  decreasing  order  of  sizes.   In  a  single  processor,  we  have  to  execute 

times  to  finish  — ^— ; swaps.  But  in  a  parallel  computer  with  p  processors,  we  can 


use  m  processors,  where  m  =  p  if  p  < 

'k(k-l) 
swaps  by  executing  at  most 


[lb] 

\k] 

it 

— 

2 

2 

r   ■  ,   k{k-\) 
and  m  =    —    if    —     <  p  to  finish 


9 

+  1  swaps  in  each  processor.  For  example,  if 
2  m 

the  dim(jVS')  is  five  (  say  NS  =  [  esi  a2  a3  a4  ah  }T  with  ax  <  a2  <  a3  <  a4  <  a5  )  and 
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there  are  two  processors,  then  we  can  swap  the  elements  of  NS  in  the  following  way: 


O] 

a4 
0.5 


a  2 
a4 
ax 
as 

Q3 


(i  A 
a5 
a2 

(13 


05 

03 

a  2 


Here  we  use  one  processor  to  swap  blocks  enclosed  by  a  brace  on  the  right,  and  the 
other  processor  to  swap  blocks  enclosed  by  a  parenthesis  on  the  right.  Each  processor 
does  5  comparisons,  and  5  swaps. 

In  Step  2,  to  compute  the  strictly  lower  triangular  matrix  G(S.U),  we  can  use 
parallel  processors.  Suppose  A  =  [Ai, . . . ,  Ap],  S  =  [Si, . . . ,  Sp],  and  U  =  [t  1, . . . ,  f/p] 
are  column  partitionings  of  .4,  S,  and  £/  respectively,  where  each  block  column  has 

width  r.  To  compute  £  =  AS-SU,  first  we  find  D  =  SU,  and  then  find  B  =  AS-D. 

T  * 

Since  Uj  =   [0g.,...,0g,O,...,O]     ,  Ua  <E  Crxr,  so  Dk  =  SUk  =  £$!/#,  A-  = 

1, . . .  ,p.  Thus  each  D^  can  be  computed  in  a  separate  processor.    Since  U  is  upper 

triangular,  so  Dp  involves  much  more  work  than  D\.    To  load  balance,  we  assign 

Proc(£ )  the  computation  of 

D(:,t:p:n)  =  SU(:,t:p:n)  =  '£iS{:,l:p:n)U(l:p:n,(:p:n), 

1=1 

as  suggested  in  [Gol90,  page  291].  Next  with  Ss  =  [  Sj, . . . ,  Sj  ]T  ,    Sij  <E  Crxr, 
Bk  =  /l.s',  -  Dk  =  Y,  AjSjk  -Dk,    A-  =  1 p. 

So  p  processors  can  be  used  to  compute  B.  where  D  can  be  overwritten  by  B.  As 
we  mentioned  in  Section  2.5,  to  compute  GX}  we  need  entries  below  Gt]  on  the  jth 
block  column;   namely  G,+ij ,G.\'Bj-  and  entries  on  the  left   of  GtJ   on   the  ith 


block  row;   namely  G,\, . . .  ,G,j-i-    To  load  balance,  we  will  use  the  following  two 

NBl 

strategies  to  compute  I  he  elements  of  6'(>.  I   );  one  for  p  > 

NB] 


+  1,  and  the  o\  her 


for  p  < 


+  1. 
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Figure  2.3.  The  direction,  in  which  the  elements  of  G(S,  U)  are  computed,  when  the 
number  of  processors  are  greater  than  half  of  the  total  blocks. 


Suppose  p  > 


NB] 


+  1.  Then  we  use  Proc(l)  to  compute  Gnbj,  J  =  li---i 
NB  -  1  in  the  increasing  order  of  the  column  index  j,  and  the  other  p  -  1  processors 
to  compute  the  remaining  elements  of  G(£,  U)  in  the  decreasing  order  of  the  row 
index  i,  as  we  explain  below.  We  use  Proc(2)  to  compute  Gt\,  i  =  NB  —  1, ...  ,2, 
Proc(3)  to  compute  Gl2,  i  =  NB  -  1, . . . ,  3.  Continuing  the  assignment  in  this  way, 
we  use  Proc(p)  to  compute  Gtp-U  i  =  NB  -  1, . . .  ,p.  If  NB  -  1  >  p,  then  we 
will  continue  this  ascending  order  of  assignments  to  those  processors  except  Proc(l). 
That  means,  Proc(2)  to  compute  Gip,  i  =  NB  -  1,. . .  ,p  +  1,  Proc(3)  to  compute 
Gjp+i,  i  =  NB  -  1, . . .  ,p  +  2,  and  in  this  order  if  necessary  Proc(p)  to  compute 
Gi2P-2,  i  =  NB-  1, . . .  ,2p-  1.  Pictorially  it  can  be  presented  by  arrow  heads,  where 
the  elements  are  computed  towards  the  arrow  heads  as  in  Figure  2.3. 
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Figure  2.4.  The  direction,  in  which  the  elements  of  G{S,U)  are  computed,  when  the 
number  of  processors  are  less  than  half  of  the  total  blocks. 


Next,  suppose  p  < 


NB 


+  1.  This  time  we  will  assign  each  processor  to  compute 
the  elements  in  the  decreasing  order  of  the  row  index  i.  Proc(l)  to  compute  G?tl,  i  = 
NB,...,2,  Proc(2)  to  compute  G,2.   i  =  NB,...,3.    Assigning  the  columns  in  this 

order,  we  use  Pioc(/j)  to  compute  Gip,    i  =  NB p  +  1.     Next  we  will  repeal 

the  order  of  assignment  as  in  the  first  strategy.  Proc(l)  to  compute  G'ip+i.  i  - 
NB, . .  .,p+2,  Proc(2)  to  compute  G,p+2,  i  =  NB, . . .  ,p  +  3,  and  in  this  order  Proc(p) 
to  compute  G'l2p,  «  =  NB,...,2p  +  1.  If  necessary,  we  will  repeat  it.  Pictorially  it 
can  be  shown  by  arrow  heads,  where  the  elements  are  computed  towards  the  arrow 
heads  as  in  Figure  2. 1. 

In  both  strategies,  we  can  not  start  all  processors  simultaneously.    In  the  Rrsl 

strategy,  before  Proc(^).  ^  =  2 p.  computes  an  element  GtJ.  we  Deed  to  make 

sure  that  Gui  1=1 j  —  1  are  already  computed,  and  if  ?  =  NB  -  1.  then  make 

sure  that  Proc(l)  already  computed  Gnbj-  The  Dumber  of  flops  required  by  the  serial 
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algorithm  to  compute  G(S,  U)  is  a  complicated  function  of  the  block  sizes  of  U.    If 

5 
all  blocks  are  lxl  matrices,  then  the  algorithm  requires  —n    flops. 

In  Step  3,  to  evaluate  f(t)  for  different  values  of  £,  we  can  use  parallel  processors. 

To  compute  the  lower  triangular  matrix  L,  where  L{j  =  tGij  for  1  <  j '  <  i  <  NB, 

we  can  use  one  of  the  strategies  discussed  earlier  to  compute  G(S,  U),  depending  on 

the  number  of  processors  p.    Assign  Proc(£),  1   <  £  <  p,  to  take  Ljj  =  /,  where 

j  =  pa  +  %,  a  =  0, 1, . . . ,  and  j  <  NB.  We  can  use  the  ideas  discussed  in  the  analysis 

of  Step  2,  to  find  matrix  products.   Next  we  will  give  a  sketch  to  find  the  inverse  of 

a  matrix  in  a  distributed  memory  multiprocessor,  where  processors  are  set  up  in  a 

ring.  An  efficient  way  to  find  the  inverse  of  a  matrix  A  is  to  find  the  LU  factorization 

of  A  with  partial  pivoting  first.  Let  piv(l  :  n  —  1)  —  0  be  a  zero  vector.  Consider  the 

following  node  program,  which  is  due  to  G.  H.  Golub  and  C.  F.  Van  Loan,  where  £ 

is  the  identity  of  the  £th  processor: 

B  —  A{\  :  n,  £  :  p  :  n);  pivot  =  piv(£  :  p  :  n  —  1) 

.7=1;  9  =  1;  col  =  £  :  p  :  n;  L  =  length(co/) 

while  q  <  L 

if  j  =  col(q)  A  j  <  n 

{  Find  the  permutation  index  piv(j),  and  Gauss  vector  A(j  +  1  :  n,j).  } 

Determine  k  with  j  <  k  <  n,  so  \B(k,q)\  =  \\B(j  :  n,<7)||oo 

pivot{q)  =  k;  B(j,  1  :  L)  «-»  B[k,  1  :  L) 

B(j  +  l  :n,q)  =  B{j  +  l:n,q)/B(j,q) 

Send  pivot(q),  B(j  +  1  :  n,q)  to  processors  on  the  right. 

{  Update  local  columns.  } 

if  q  <  L 

B{j  +  l:n,q+l:L)  =  B(j  +  1  :  n,q+  1  :  L) 

-B(j  +  l:n,q)B(j,q+l:L) 
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end 

J  =J  +  U   q=q+l 
else 

Receive  piv(j).  and  A{j  +  1  :  n,j)  from  the  left,  and  if  the  processor  on 

the  right  is  not  the  processor  which  computed  A(j  +  1  :  n,j),  and  j  is 

less  than  the  last  column  index  of  the  processor  on  the  right,  then  send 

piv(j),  and  A(j  +  1  :  n,j)  to  the  right. 

{  Update  local  columns.  } 

B(j,\  :L)~B(piv(j),l:L) 

B(j  +  1  :n.q:L)  =  B(j  +  1  :  n,q  :  L)  -  A(j  +  1  :  nJ)B(j,q  :  L) 

J=J  +  l 
end 

end 

After  each  processor  in  a  p-processor  ring  finishes  the  execution  of  the  above  code, 
we  will  get  the  LU  factorization  of  A;  Proc(0  will  house  A(l  :  n,£  :p:n)  in  a  local 
array  5,  and  piv(£  :  p  :  n  —  1)  in  a  local  vector  pivot.  The  above  LU  factorization  of 
A  is  of  the  form  PA  =  LR.  After  simplifying,  we  get  A~l  =  /?_1  L~x P .  and  therefore 
to  obtain  A~\  we  need  to  determine  L~l ,  R~l,  and  R~l  L~x .  First,  we  will  discuss 
how  to  find  the  inverse  B  of  a  lower  triangular  matrix  I.  where  L  is  overwritten  by 
B.  Consider  the  following  node  program: 

D=  [L(k:  n,Jfc)],  fc  =  £:p:n;  j  =  l 

q  =  \\  col  =  £  :  p  :  n;  Ar  =  length (co/) 

while  j  <  a 
if  j  =  col(q) 

iO>l 

Send  D{]  :  n.q)  to  processors  on  the  left. 


77 


end 

Update  local  columns.  } 
ifq>l 

D(j,l:q-l)  =  D(j,l:q-l)/D(j,q) 
if  j  <n 

D(j  +  1  :n,l:  q-1)  =  D(j  +  1  :  n,l  :  q-1) 
-D(j  +  1  :n,q)D(j,l  :q-l) 
end 
end 

D(j,q)  =  l/D(j,q) 
if  j  <  n 

D(j  +  l:n,q)  =  -D(j  +  l:n,q)D(j,q) 
end 

J  =  J  +  1 
else 

if  j  >  col(l) 

Receive  L(j  :  n,j)  from  the  right,  and  if  the  processor  on  the  left  is 

not  the  processor  which  houses  L(j  :  7i,j),  and  j  is  greater  than  the 

first  column  index  of  the  processor  on  the  left,  then  send  L(j  :  n,j) 

to  the  left. 

{  Update  local  columns.  } 

D(j,l:q)  =  D(j,l:q)/L(j,j) 

if  j  <  n 

D(j  +  1  :  n,  1  :  q)  =  D(j  +  1  :  n,  1  :  q)  -  L(j  +  1  :  n,j)D(j,  1  :  q) 
end 

j  =  j  +  1 
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if  col{q)  +  p  =  j 

q  =  q+l 
end 
else 

J  =  J  +  1 
end 

end 

end 

If  each  node  in  a  p-processor  ring  executes  the  above  code,  then  we  will  get  the 
inverse  of  L,  and  upon  completion  Proc(£)  will  house  L(k  :  n,  k)  for  k  =  f  :  p  :  n  in  a 
local  array  D.  Our  next  move  is  to  find  the  inverse  B  of  an  upper  triangular  matrix 
/?,  where  R  is  overwritten  with  B.  A  node  program  can  be  structured  as  follows: 
D  =  [R{\:  k,k)],  k  =  £:p:  n;  j  =  n 
col  =  £  :p  :  n;  N  =  length(co/);  q  =  N 
while  j  >  1 
if  j  =  col{q) 
if  j  <  ?! 

Send  D{\  :  j.g)  to  processors  on  the  right. 
end 

{  Update  local  columns.  } 
if  q  <  N 

D(j,q+\  :N)  =  D{j,q  +  l  :N)/D(j,q) 

D(l:j-  1,9  +  1:  -V)  =  D(l:j-1,9  +  1:  .V) 
-D(l  :j-l,9)I>(i,?  +  l  :-V) 
end 
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end 

D(j,q)  =  l/D(j,q) 

if  J  >  1 

D(l  :  j  -l,q)  =  -D(l  :  j  -l,q)D(j,q) 
end 

3  =  3  -  ! 
else 

if  j  <co/(7V) 

Receive  R{\  :  j,  j)  from  the  left,  and  if  the  processor  on  the  right  is 

not  the  processor  which  houses  R{\  :  j,j),  and  j  is  less  than  the  last 

column  index  of  the  processor  on  the  right,  then  send  R(l  :  j,j)  to  the 

right. 

{  Update  local  columns.  } 

D(j,q:N)  =  D(j,q:N)/R(j,j) 

if  J  >1 

D(l:j-l,q:N)  =  D(l:j-l,q:N)-R(l:j-  l,j)D(j,q  :  N) 
end 

3  =  3  -  1 

if  col{q)  -  p  =  j 

q  =  q-l 
end 
else 

3  =  3  ~  1 
end 

end 

end 
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If  each  processor  in  a  p-processor  ring  executes  the  above  node  program,  then 
upon  completion  Proc(£)  will  house  /?(1  :  k.k)  for  k  =  f  :p:  n  in  a  local  array  D. 
In  the  LU  factorization  of  .4,  R  can  be  stored  in  the  upper  triangular  part  of  -4,  and 
the  unit  lower  triangular  matrix  L  can  be  stored  in  the  strictly  lower  triangular  part 
of  A.  Then  we  can  use  the  above  code  to  find  the  inverse  of  the  upper  triangular  part 
of  A,  and  after  a  slight  modification  of  the  node  program,  which  finds  the  inverse 
of  a  lower  triangular  matrix,  we  can  use  it  to  find  the  inverse  of  the  strictly  lower 
triangular  part  of  .4.  Since  A-1  =  R~x L~x P .  so  our  next  goal  is  to  create  a  node 
program  to  find  the  product  R~l L~x .  Consider  the  following  node  program: 

B  =  A(\  :  n,£  :  p  :  n):  j  =  1;  col  =  £  :  p  :  n:  L  —  length(co/);  q  =  1 

while  j  <  n 
if  j  =  col(q) 

if  J  >  1 

Send  B(\  :  j,q)  to  processors  on  the  left. 

end 

{  Update  local  columns.   } 
if  q  >  1 


B(l:j-l,l  :q-\ 
0 


+  B{l:j,q)B(jil:q-l) 


li(\  :jA  :q-l)  = 
{  0  is  a  q  —  1  dimensional  row  zero  vector.  } 
end 

J  =  J  +  1 
else 

if  j  >col(\) 

Receive  A(\  :  j.j)  from  the  right,  ami  if  the  processor  on  the  lefl  is  not 
ili.'  processor  which  houses  ,4(1  :j,j).  and  j  is  greater  than  the  first 
column  index  of  the  processor  on  the  left,  then  send  4(1  :  jyj)  to 
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the  left. 

{  Update  local  columns.  } 

B(l:j-l,l:q) 


B{l:j,l:q)  = 

3  =  3  +  1 

if  col(q)  +p  =  j 

q  =  q+l 
end 


0 


+  A(l:j,j)B(j,l:q) 


else 


3  =  3  +  1 


end 
end 
end 

After  each  processor  in  a  p-processor  ring  executes  the  above  code,  we  will  obtain 
the  product  R'1  L~x  which  overwrites  A,  and  upon  completion  Proc(£)  will  house 
A(l  :  n,£  :  p  :  n)  in  a  local  array  B.  Since  A  is  overwritten  by  R~x L~x ,  and  the 
permutation  matrix  P  is  encoded  in  the  (re  —  1)  dimensional  vector  piv,  so  A'1  = 
(R~1L~1)P  can  be  computed  in  the  following  way: 
for  k  =  re  -  1  :  -1  :  1 
if  A'^piv(A:) 

A{\  :  n,k)  <->  A{\  :  re,piv(fc)) 
end 
end 

Next,  if  we  store  the  upper  triangular  matrix  U  in  the  upper  triangular  part  of  a 
n  x  n  matrix  5,  and  the  strictly  lower  triangular  matrix  G(S,  U)  in  the  strictly  lower 
triangular  part  of  B,  then  the  code,  which  is  used  to  compute  the  product  R~l  L~x 
can  be  applied  to  B  to  find  the  product  UG(S,  U).  Similarly,  it  is  possible  to  find  the 


VJ 


product  G{S,U)U  in  a  more  efficient  way.  After  computing  UG{ 5,  V).  S  '.  >'  lAS. 
and  G{S,U)U,  we  can  take  advantage  of  parallel  processors  to  compute  U(t)  = 
U  +  t(UG(S,U)  +  S-'AS  -  U  -  G{S,U)U).  Next  after  Z(t)  =  A-  S(t)U(t)S(t)-' 
is  calculated,  we  can  find  f(t)  =  Jixa.ce{Z{t)H Z(t))  in  the  following  way.   Let  x(l  : 

7? 

p)  =  0  be  a  zero  vector.    Let  Z  =  Z(t),  and  r  =  — .    To  compute  Zlt Z  we  use  the 

P 
following  procedure: 

£  =  Z(1  :n,l  +  (£-l)r:£r);  y  =  z(£) 

for  i  =  1  :  r 

y  =  y  +  [JB(l:n,i)]flr£(l  :n,i) 

end 

Here  each  processor  can  execute  the  above  code  simultaneously,  and  upon  com- 
pletion Proc(£)  houses  x(£)  in  a  local  variable  y.  Then  f(t)  =  ^/xj  +  . . .  +  xp. 

In  Step  4,  if  A'fl  —  m  <  p,  then  we  can  not  use  all  processors  to  find  il.  and  in. 
Suppose  NB  —  m  >  p.  Using  a  similar  technique,  as  described  in  the  analysis  of  Step 
1,  we  can  get  an  array  k  such  that  |/u-(')l  <  |/U-('  +  1)1  holds.  Let 


r,  = 


fik(i+i)  -  f'k(i) 
dk(,)  —  <4(,+i) 


To  determine  \r,0\  —  min{  |r,|    :    i  =  1,. . . ,  NB  —  w  —  1,   Re  (r,)  >  0  &:  |/',|  <  1   }.  we 

NB-m 


can  use  p  processors  as  follows.  Let  s 


.  Define 


B2 

B    , 
ft 


p    . 


^Jk(«+1:2«+1) 

Pfc((p-2)«+l:(p-l)«+l) 

/U((p-l)s+l:.VB-m) 


''. 

c2 



cp_, 

L    Cp    . 

dfc(l:«+l) 
Ujfc(a+1:2»+1) 

rffc<(p-2)»+l:(p-l).+l) 

'^((p-l)3+l:A'B-m) 


/  =  [  l,.s  +  1.2.s+  1 (p—  1)3  + 1  ]7',  u(l  :  p)  =  0.  and  u(]  :  /»)  =  0.  Next  consider 

the  following  procedure: 
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u  =  l(0]  L  =  length(/);  r(l  :  L  -  1)  =  0 
for  i  =  1  :  L  -  1 

/i+l  —  /» 

r,-  =  

end 

13=1;  j  =0 
for  i  =  1  :  L  -  1 

if  Re(r,)  >  0  A  |r,|  <  /? 

/?=  k,-|;  j  =  i 
end 
end 
if  .7=0 

6  =  0 
else 

8=u-l+j 
end 

If  each  processor  executes  the  above  code,  then  upon  completion  Proc(£)  houses 
u(£)  =  f3,  and  u(f)  =  6.  Next  the  desired  minimum  is  u{^0)  =  min{  *<(£),  £  = 
1, . . .  ,p  },  and  if  the  corresponding  v(£0)  ^  0,  then  define  il  =  t  -  1  +  mm{k(v(^0)), 

fc(u(6>)  +  1)},  and  iu  =  t  -  1  +  max{fc(t;(6,)),*(i;(60  +  !)}• 

In  Step  5,  we  can  use  parallel  processors,  as  described  in  the  analysis  of  Step  2, 
and  Step  3  to  obtain  Unew,  Snew,  and  Z  =  SnewUnew  (5neu,)_1. 

In  Step  6,  if  m  <  p,  then  we  can  assign  a  separate  processor  to  implement  the  code 
to  each  diagonal  block  Ujj,  1  <  j  <  m.  If  m  >  p,  then  assign  Proc(£)  to  implement 
the  code  to  diagonal  blocks  Ujj,  where  j  e  f  :  p  :  m.  Thus  we  have: 

col  =  £  :  p  :  m;  L  =  length(co/) 

for  q  =  1  :  L 
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a  —  col(q);  Implement  Step  6  to  UQQ 
end 

In  Step  8,  we  can  use  parallel  processors  to  find  a  QR  factorization  of  5.  Here  we 
will  discuss  how  to  find  a  QR  factorization  of  a  matrix  ,4  in  a  distributed  memory 
multiprocessor,  where  processors  are  interconnected  in  a  ring.  Let  6(1  :  n  —  1)  =  0 
be  a  zero  vector.  Consider  the  following  procedure,  which  is  due  to  G.  H.  Colub  and 
C.  F.  Van  Loan: 

B  =  .4(1  :  n,(  :  p  :  n);  c  =  b(£:p:  n  -  1) 
j  =  1;  q  =  l]  col  =  £  :  p  :  n;  L  —  length(co/) 
while  q  <  L 

if  j  =  col(q)  A  j  <  n 

{  Find  Householder  vector  A(j  +  1  :  ntj).  } 

a=\\B(j:n,q)\\2 

if  a  =  0 

B(j  :  7?,r/)  =  0;  quit 


else 


P  =  <*B(j,q)/\B(j,q)\ 
else 

0  =  a 

end 

(3  =  B(jtq)  +  ii-   B(j  +  1  :  n,q)  =  B(j  +  1  :  ntq)/fi  B(j,q)  = 


— o 


end 


v  = 


1 


B(j+  1   :».<i) 


;  c(q)  =  — 77- 

v"  V 
Send  c(g),  and  B(j  -f  1  :  n,q)  to  processors  on  the  right. 

{  Update  local  columns.  } 
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if  q  <  L 

B(j  :n,q  +  l:L)  =  B{j  :  n,  q  +  1  :  L)  +  c{q)vvHB{j  :n,q  +  l:L) 
end 

j  =j  +  l;  9  =  9  +  ! 

else 

Receive  6(j),  and  A(j  +  1  :  n,  j)  from  the  left,  and  if  the  processor  on  the 

right  is  not  the  processor  which  formed  A(j  +  1  :  n,j),  and  j  is  less  than 

the  last  column  index  of  the  processor  on  the  right,  then  send  b(j),  and 

A{j  +  1  :  n,j)  to  the  right. 

{  Update  local  columns.  } 

1 
V~  [A(j  +  l:nJ) 
B(j  :n,q:L)  =  B(j  :  n,q  :  L)  +  b(j)vvHB(j  :  n,q  :  L);  j  =  j  +  1 

end 

end 

After  each  processor  in  a  p-processor  ring  executes  the  above  code,  we  will  get  a 
QR  factorization  of  A,  and  Proc(f)  will  house  A(l  :  n,£  :  p  :  n)  in  a  local  array  B, 
and  b(£  :  p  :  n  —  1)  in  a  local  vector  c.  The  upper  triangular  part  of  A  is  overwritten 
by  the  upper  triangular  part  of  R,  and  components  j  +  1  :  n  of  the  jth  Householder 
vector  are  stored  in  A(j  +  1  :  n,j),  j  <  n.  Here  we  need  an  explicit  form  of  the 
unitary  matrix  Q.  We  find  Q  using  the  following  procedure,  which  uses  Householder 
vectors  A(j  +  1  :  n,j),  where  j  <  n,  and  the  vector  b  to  form  Q  explicitly,  and  zeros 
subdiagonal  elements  of  A.  Let  Q  =  /„: 

B  =  A{\  :n,i  :  p  :  n);  /  =  Q{\  :n,£:p:n);  c  =  6(£  :  p  :  n  -  1) 

j  =  n  —  1;  col  =  £  :p:  n;  L  =  length(co/);  <?  =  L 

while  j  >  1 
if;  =  col{q) 
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Send  c(q),  and  B(j  +  1  :  n,q)  to  procesors  on  the  left 

s.   } 
;   B(j  +  l  :n,g)  =  0 


{  Update  local  columns.  } 
1 
B(j  +  \  :n,q) 
I(j  :n,q:  L)  =  I(j  :n,q:  L)  +  c(q)vvHI(j  :n,q:  L)\  j  =j  -  1 


else 


Receive  b(j),  and  A(j  +  1  :  n,  j)  from  the  right,  and  if  the  processor  on  the 
left  is  not  the  processor  which  houses  A(j  +  1  :  n.j),  and  j  is  greater  than 
the  first  column  index  of  the  processor  on  the  left,  then  send  b(j),  and 
A(j  +  1  :  n,j)  to  the  left. 

{  Update  local  columns.  } 

1 
A(j  +  l  :nj) 

I(j  :  n,q  :  L)  =  I(j  :  n,q  :  L)  +  b(j)vv"l(j  :  n,q  :  L)\  j  ==  j  -  1 
if  co/(</)  -  p  =  j 

q  =  q  -  1 
end 
end 
end 

If  each  processor  in  a  p-processor  ring  executes  the  above  code,  then  upon  com- 
pletion Proc(£)  will  house  A(\  :  k,k)  for  k  =  £  :  p  :  n  in  a  local  array  B,  and 
Q(l  :  n,£  :  p  :  n)  in  a  local  array  /.  Next  we  can  use  parallel  processors,  as  d«'MTil>cd 
in  the  analysis  of  Step  2  to  find  the  product  Unew  =  QHZQ. 

Finally  in  Step  9,  we  can  use  multiprocessors,  as  described  in  the  analysis  of  Step 
3  to  evaluate  /(0)  =  \\A  -  Z\\F. 


CHAPTER  3 
EIGENVALUES  OF  SYMMETRIC  MATRICES 

3.1     Diagonalization  of  a  Symmetric  Matrix  using  Armijo's  Stepsize 

Let  A  be  an  n  x  n  symmetric  matrix.  Suppose  the  eigenvalues  of  A  are  all 
distinct.  In  this  section  we  will  find  an  iterative  method  to  compute  a  diagonalization 
A  =  Q\QT,  where  A  =  diag(  Xu  A2,  •  •  • ,  An  ),  and  Q  is  an  orthogonal  matrix  such 
that  the  diagonal  elements  of  A  are  the  eigenvalues  of  A,  and  the  columns  of  Q  are 
eigenvectors  of  A.  Consider  the  following  algorithm  to  compute  the  eigenvalues,  and 
corresponding  eigenvectors  of  a  matrix  A.  Detail  is  given  in  [HagSS,  page  304  -  14], 
or  in  Chapter  1: 

Xy™     =     (yfd)T  Ax°'d  for  j  =  1  ton, 

n    (v°lA     Arold 
x™     =     x-W  +  y^U 1-xf    forj  =  lton,  (3.1) 

*J  3         '     l—d     \neui  \new       '  J 

\'new      /  vnew\  —  1 

Algorithm  in  (3.1)  can  be  rewritten  in  the  following  way: 

Anew     =    D^.g((xoldy1  AXM\ 

ynew  yold  ( j    ,     p  /yoW     AneuA\ 

where  Diag(M)  =  diag(  mn,  ???22,  •  •  • ,  m-nn  ),  and 

0  for  i  =  j, 

([x^Y'ax0 


(3.2) 


fiAxold,knew)  =  < 


\  new  \  r 

A33       ~  Ai 


Id 

—    for  i  ^  j. 
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In  (3.2).  if  we  put  J"1""  =  I  +  F  (XoU,Anew),  and  AM  =  (XM)    '  AX0,d,  then  it 
reduces  to  the  form: 

A"""     =    D\ag(  (Y^y1  AoldYolA, 

ynew  yoldynew 

Hence  in  each  iteration,  we  can  update  the  eigenvalues,  and  corresponding  eigenvec- 
tors in  the  following  way: 

Anew       (\'old\  ,\old\-old 

A—     =    Diag(/r- ),  f33l 

ynew  j     ,     j?  /  Anew\ 

■ynew       yoldynew 

where 

'  0  for  i  =  j. 

fij(An™)=< 


for  ?  /  j. 


nnew  r.new 


The  updating  process  in  (3.3)  does  not  restore  the  symmetry  of  the  given  matrix 
A,  in  the  update  Anew  =  (YoIJ)  AoldYold.  In  each  iteration  to  restore  the  symmetry 
of  A,  and  the  orthogonality  of  X  we  do  the  following.  Let  Ynew  =  Qnew Rnew  be  a  QR 
factorizat  ion  of  YnfW.  Now  if  we  take  Anew  =  (Qold)T  AoldQold.  and  Aneu'  =  XMQncw, 
then  Ancw  will  be  symmetric,  and  Xnew  will  be  orthogonal.  With  these  modifications, 
(3.3)  becomes: 


T 


Anew  _  ^Qold^       AoldQold^ 

Anew  =  Diag(/ret"), 

ynew  _  /  +   /.    (/J"""^  (3.4) 

r*  =  QnewRiew,      (  QR  factorization  ) 

ynew  yolds^new 


where 


/„  (.■!"-)  = 


for  z  =  j, 


" 


a"™ 
? for  i  ?  j. 

-.new  '      J 


If  the  starting  guess  matrix  of  eigenvectors  Qo  is  not  a  good  approximation  of 
Q   in   the  factorization   A    =    Q\Q' ''.   then   in   the  update   V"'"     =    /  +  /•' (.•!"' "  ). 
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the  increment  F  (Anew)  may  be  too  large.  However  with  a  large  increment,  we 
may  likely  have  instabilitiy  in  the  updating  process,  and  the  algorithm  may  di- 
verge. To  restore  the  convergence  of  the  algorithm,  and  to  have  a  steady  change 
in  the  value  of  Ynew ,  we  need  a  small  increment  in  each  iteration.  To  achieve 
this,  we  redefine  each  iterate,  and  introduce  a  small  positive  parameter.  Let  s  be 
a  positive  parameter,  and  define  Z(s)new  =  I  +  sF(Anew).  So  Z{0)new  =  I,  and 
Z(l)neu'  =  Ynew,  the  matrix  generated  by  (3.4).  Define  f(s)  =  \\G{s)\\F,  where 
G(s)  =  lotr  ([Z(s)otdy1AoldZ{s)old\  lotr(5)  is  a  strictly  lower  triangular  matrix 
formed  from  the  strictly  lower  triangular  part  of  B.  Suppressing  the  superscripts  in 
Z(s)new,  and  G{s),  we  have  Z{s)  =  I  +  sF(A),  and  G{s)  =  \ott{Z{s)-1AZ{s)). 
Hence  /(0)  =  ||lotr  (.4)  ||F,  and/(l)  =  ||lotr  (Y~lAY)  \\F. 

With  a  starting  guess  A'0,  if  the  algorithm  in  (3.4)  converges,  then  we  must  have 
/(l)  <  /(0).  As  in  Chapter  2,  here  our  goal  is  to  find  an  s,  0  <  s  <  1,  for  which 
f{s)  ^  /(0)  holds.  To  this  end,  we  use  Armijo's  rule  from  optimization  theory.  In 
Armijo's  rule,  we  determine  s  in  the  following  way.  Evaluate  f(s)  at  s  =  1,  |,  |,  ■  •  • , 
stopping  when 

f(s)  <  (l  -  0  /(0). 

To  use  the  above  inequality,  f(s)  must  satisfy  /'(0)  =  — /(0).  Hence  our  next  goal  is 
to  evaluate  /'(0),  when  f(s)  =  ||G(s)||j?  with  G(s)  =  lotr  (Z(s)"MZ(s)).  Before  we 
find  /'(0),  we  will  prove  the  following  basic  result. 

Lemma  3.1.1  If  A  £  Wixn  is  a  symmetric  matrix,  and  B  G  Rn* "  is  a  skew  symmetric 
matrix,  then  AB  —  BA  is  a  symmetric  matrix. 

Proof  of  Lemma  3.1.1  Let  A  be  a  symmetric  matrix,  and  B  be  a  skew  symmetric 
matrix.  So  AT  =  A,  and  BT  =  -B.  Now 

{AB-BA)T    =    {AB)T-{BA)T    . 
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=    BTAT-ATBT 
=    (-B)A-A(-B) 
=    AB-BA 

Hence  AB  —  BA  is  a  symmetric  matrix.     □ 

With  the  above  definition  of  f(s),  now  we  are  ready  to  find  /'(0),  and  will  show 
that  /'(0)  =  -/(0). 

Theorem  3.1.1   Let  A  (E  R"*  l   be  a  symmetric  matrix.    Suppose   its  eiyenvaluts  are 

all  distinct    Define  Z{$)  =  I  +  sF(A),  G(s)  =  lotr  ((Z(s))-1  AZ(s)) ,  and  f(s)  = 

\\G(s)\\f,  where 

(  0  for    j  =  j, 

h(A)=\   -^-    for   ,+1. 

I    Cjj  ~  Oft 

lotr[B)  is  a  strictly  lower  trianyular  matrix  formed  from  the  strictly  lower  triangular 
part  of  B.   Then  /'(0)  =  -/(0). 

Proof  of  Theorem  3.1.1  Using  the  result  (2.18)  of  Lemma  2.3.1,  \vc  have 

/(0)/'(0)  =  trace  (g(0)T  ^G(s)\s=Q)  . 

Differentiating  G{s)  with  respect  to  s,  and  then  evaluating  the  derivative  at  5  =  0, 
we  obtain 


(Is 


J(s)\t=0  =  lotr  f-Z(0)-,-^Z(5)|3=oZ(0)-1/lZ(0)  +  Z(0)-U^Z(5)|s=oy  (3.6) 


Differentiating  Z(s)  with  respect  to  .s.  we  have  —Z(s)  —  F.  Hence  —  Z(s)L=o  =  F. 

da  ds 

Substituting  this  value,  and  the  value  of  Z(0)  =  I  in  (3.6),  and  then  simplifying  we 

get 

^-G(s)l,=0  =  \olv(AF-FA).  (3.7) 

ds 
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Since  G(0)  =  lotr(Z(0)"MZ(0))  =  lotr  (JAJ)  =  lotr(A),  so  (3.5)  becomes 

/(0)/'(0)  =  trace  ((lotr(A))T  lotr  (AF  -FA)).  (3.8) 

Because  A  is  symmetric,  so  (lotr  (A)f  =    uptr  (A) ,  where  uptr(A)  is  a  strictly  upper 
triangular  matrix  formed  from  the  strictly  upper  triangular  part  of  A.   Hence  (3.8) 

reduces  to 

/(0)/'(0)  =  trace  (uptr  {A)  lotr  (AF  -  FA)) .  (3.9) 


Next,  for  i  ±  j,    fi,   = — ,  and  fJt 


aji 


Cljj         0-H 


ati  -  a 


-.     Since  A  is  symmetric,  so 


jj 


f.    =  _yjt..  That  is  F  is  a  skew  symmetric  matrix.  Hence  by  Lemma  3.1.1,  AF-FA 
is  symmetric.  Now  AF  —  FA  = 


IL      9a2.  JL   aual2      .  ^   aXiai2  '^    fliifl™      .  ^  auam 


E 


S'l 


la 
2  an  -  a 


E-^-+E- 


n       9  r/2 
y      "  a»2 

-2all  —  a«'«         ^Tfl22  -  aii  ,  =  ia22  —  Oft 

,^2  1*2 


^    «*2t«*»l  Y1. 


E- 


O^niCln 


n-1  ._. 

^-^      umuzl 


aniai2 


n-1 


^— v    "m^ii        .    V-^ 


nnia%2 


~iau  —  an     :_Aann  —  an  ,=ia22  —  aii     l=1ann      an 

l~l  .*2 


n-1 
-,a-nn  —  O-ii        ,  =  2all   —  °« 


+£: 


i=l 
n-1 


a2iain 


&2iain 


^—^     u-2iuin  ^"^ 

1=1«nn  —  «it        ,=ia22  —  a« 
<?!2 


^     2  a2 


,_jOnn        Qjt' 


Therefore  (3.9)  gives: 


/(0)/'(0)  = 


trace 


akini\ 


(lklnil 


k=2 


X 


X 

X 


n  In 

X>2*  £ 

/:=3 


X 


akiUi2 


+E 


afc;a,2 


\,=i  a22  -  a..      ,=1  afcfc  -  an 

\„£2  i*k 


X 
X 
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an-ln 


£ 


X 
X 


OniO,n_l 


n-1 

+  £ 


aniain-l 


,  =  i      Gn-ln-1    _  Oti  1  =  1    Orln  —  O,, 


A' 
A 


X 


A  0   J/ 

where  A'  are  numbers.   Simplifying  the  right  hand  side  of  the  above  expression,  we 

get 


/(0)/'(0) 


£«i*|£ 

k=7 


ak,ati 


+  £ 


ajt.a,] 


ajuai2 


=2  "n  -  «..        ,=i  "kk  -  «..  /       ^         \  ~;  a22  -  a, 


+  £ 


akia,2 


( 


,=i  akk  -  a. 


+  •  ■  •  +  a-i-in 


£ 


"m"in  — 1 


n-1 


■  =  1       an-ln-l    —  «i 


+  £ 


OniO,n_l 


—  j    "nn         O,, 


/    (1220.21  a2i«n    \  /    anr.a„i  aniau    \ 

an    + )  +  •  •  •  +  ai„     + 

\an  -  a22       «22  — an/  Von-ann       an„-ai,/ 

/     a33«32  «32fl22      \  /     ann«n2  an2a22      \ 

+  «23      +  )   +  ■■■  +  <l2n(  +  

\a22-(l33  033-^22/  \«22-«nn  Onn-a22/ 

Von  - 


/       dnndnn-l  ,    Onn_j ^n-ln-l  \    , 

H hOn-lnl  l+«12ai3a23 


+ 


1 


+ 


/'n-ln-1  _ Onn        "nn-  an-ln-l; 

1111 
+ + + 


G22  —  «33        «22  —  Oil         033  —  Oll         On   —  a22        O33  —  <J22, 

/                  1                                           1  1 

On-2n-lO„_2,iOn-l„  | 1 (- 


11   —  O33 
+  ■"  + 


+ 


1 


+ 


^On_2n-2  — Onn        On_in_i—  Onn        an_i„_i — an_2„-2 
1  1  \ 


+ 


Onn  —  On_2n-2  On_2n-2  _  On_i„_i  dnn  —  (ln-\n_i  ) 

=  -  E  "•■ 

l<i<j<n 

=  -/(0)2. 

So  /'(0)  =  -/(0),  provided  /(0)  ^  0.    D 

Hence  Theorem  '5.1.1  implies  thai  Armijo's  rule  can  be  applied  to  the  algorithm 
in  (3.4). 

A  detailed  sketch  of  the  algorithm  to  find  the  eigenvalues  of  a  symmetric  matrix 
.1.  whose  eigenvalues  are  all  distinct  is  as  follows: 
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Algorithm  3.1.1  (Symmetric  Diagonalization)  Given  a  symmetric  matrix  A  G  Rnxn, 
whose  eigenvalues  are  all  distinct,  an  orthogonal  matrix  Xq,  and  a  tolerance  tol 
greater  than  the  unit  roundoff,  the  following  algorithm  computes  a  diagonalization 
X AX  =  A,  and  A  is  overwritten  with  the  diagonal  matrix  A. 

Step  0:  Take  A'  =  X0.  Compute  Anew  =  XTAoldX,  and  /(0)  =  ||  lotr  (A)  \\F. 
Step  1:  Construct  F  =  (fij),  where 

f   0  for  i  =  j, 

fij  =  a^         for  i  ^  j. 

Step  2:  Let  f(s)  =  ||G(s)||F,  where  G(s)  =  lotr  ((/  +  sF)'1  A(I  +  sF)).  Evalu- 
ate f(s)  at  s  =  1,  |,  |,  ■  ■  ■ ,  stopping  when 

/(*)<  (l  -  |)  /(0).  (3.10) 

Let  t  be  the  first  value  of  s  for  which  (3.10)  holds,  and  let  Y  =  I  +  tF .  Let  Y  =  QR 
be  the  QR  factorization  of  Y . 

Step  3:  Update  A,  and  A"  as  follows.  Anew  =  QTAoldQ,  and  Xnew  =  XMQ. 

Step  4:  Evaluate  /(0)  =  ||lotr(/l)  \\F.  Go  to  Step  1  until  /(0)  <  tol,  where  tol  is 
the  tolerance  for  the  desired  accuracy  of  the  eigenvalues. 

Example  3.1.1  If  Algorithm  3.1.1  is  applied  to  the  matrix 


A  = 


with  A'o  =  /,  and  tol  =  10  6,  then  after  6  iterations  we  obtain  a  diagonalization 
XTAX  =  diag(  2,  3.87298,  -3.87298  ).  If  we  use  the  QR  method  for  a  symmetric 
matrix  with  single  implicit  shift  to  A,  then  after  5  iterations  we  obtain  a  diagonaliza- 
tion QTAQ  =  diag(  -3.87298,  3.87298,  2  ).   Next  consider  the  following  matrix  Au 


2 

-1 

3 

1 

1 

2 

3 

2 

-1 
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which  we  obtain  by  perturbing  the  elements  of  A 

2.002     -1.001 
A1  = 


-1.001 
3.003 


1.001 
2.002 


2.002 
1.004 


If  we  use  Algorithm  3.1.1  to  Ax  with  X0  =  X,  and  to!  =  10-,i.  thru  after  2  iterations 
we  obtain  a  diagonalization  YT AXY  =  A,  whereas  if  we  apply  the  QR  method  for  a 
symmetric  matrix  with  single  implicit  shift  to  XTA\X,  then  after  3  iterations  we  get 
a  diagonalization  ST(XTAXX)S  —  D. 

Example  3.1.2  If  Algorithm  3.1.1  is  applied  to  the  matrix 

5  =  Qdiag(l,2,3,4,5,6)<?T, 

where  Q  is  an  orthogonal  matrix,  with  A'0  =  /.  and  tol  =  10~6,  then  after  7  iterations 
we  obtain  a  diagonalization  XTBX  =  diag(  5,  6,  2,  1.  1,  3  ),  whereas  if  we  apply  the 
QR  method  for  a  symmetric  matrix  with  single  implicit  shift  to  H.  then  after  11 
iterations  we  get  a  diagonalization  P7 DP  =  diag(  6,  5,  1,  4.  2,  3  ). 


Example  3.1.3  If  Algorithm  3.1.1  is  applied  to  the  matrix 

6       4       2       2    -1       3 


C  = 


with  A'o  =  /.  and  tol  =  10~6,  then  after  7  iterations  we  obtain  a  diagonalization 
XTCX  =  A.  If  we  apply  the  QR  method  for  a  symmetric  matrix  with  single  implicit 
shift    to  C,  then  after   10   iterations  we  get  a  diagonalization   QTCQ   =    D.     Next 

consider  the  following  matrix  <",,  which  we  obtain  by  perturbing  the  elements  of  C: 

2.006  -1.002  3.007 

2.009  -2.003  -3.004 

-2.002  -1.001  -1.001 

1.002  -6.00S  0.004 


1 

4 

1 

2 

_2 

-3 

2 

1 

3 

_2 

-4 

-1 

2 

2 

_2 

1 

-6 

0 

1 

_2 

-4 

-6 

-5 

4 

3 

-3 

-1 

0 

4 

—2 

<\ 


(,. 1)111 

1.006 

2.007 

4.006 

1.007 

1.004 

2.007 

1.001 

3.001 

2.006 

2.00!) 

-2.002 

-1.002     -2.003     -4.004     -6.008    -5.007        1.001 
3.007     -3.001     -1.001        0.00  1        1.001     -2.1)0  1 
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If  we  use  Algorithm  3.1.1  to  C\  with  A'0  =  X,  and  tol  =  10-6,  then  after  3  iterations 
we  obtain  a  diagonalization  YT  A\Y  =  fi,  whereas  if  we  apply  the  QR  method  for  a 
symmetric  matrix  with  single  implicit  shift  to  A'TCiA',  then  after  10  iterations  we 
obtain  a  diagonalization  S  (X  CiX)S  =  T. 

3.2     Block  Diagonalization  of  a  Symmetric  Matrix  using  Armijo's  Stepsize 

Let  A  be  an  n  x  n  symmetric  matrix.  Our  goal  is  to  find  an  iterative  method 
to  compute  a  block  diagonalization  A  =  QAQT ,  where  A  =  diag(  Al5  A2, .  .  . ,  A^  )  is 
a  block  diagonal  matrix,  and  Q  =  [  Qi,  Q2,  ■  ■  ■ ,  Qk  }  is  a  compatible  block  column 
orthogonal  matrix  such  that  AQj  =  QjA.j.  Here  Aj  =  diag(  Aj,  Aj, . . . ,  Aj  )  is  an 
rtij  x  nij  scalar  matrix.  It  can  always  be  arranged  so  that  the  eigenvalues  of  A,-,  and 
Aj  for  i  7^  j  are  distinct. 

From  the  block  diagonalization  of  a  matrix  A  in  Section  2.1,  we  have  the  following 
locally  quadratically  convergent  algorithm: 

Anew       _       (Yf'd)TAXfd  for    j  =   1     tO    k, 

k 

Xnew    =    xf  +  Y.X?dp<3    far  j  =  1  to*,  (3.11) 

1  =  1 
'*] 


where  P  =  PtJ  is  the  solution  to  the  matrix  equation  PA]ew  -  A™P  =  (Y?ld)T  AXfd, 
and  (Yj°ld)     is  the  jth  block  row  of  Yold.  Since  A"61",  and  A"61"  are  scalar  matrices,  so 


old 


(Yfld)    AXf 
Pij  =  ew ew    .    After  simplifying  the  algorithm  in  (3.11)  to  compute  a  block 

XTW  ~  x?ew 

diagonalization  of  a  symmetric  matrix  A,  we  have 

\new     =    Diag((A^)_1AYoW), 

jrnew      _      X°ld  (i  +  F  (X°ld    AneW)) 

where  Diag(£?)    =    diag(  Bu,  B22,  ■  ■  ■ ,  Bkk  )   with  B3J    =    diag(  b3,  6^, . . . ,  6,  )   is  an 
mjXrrij  scalar  matrix,  and  F  (Xold,  Anew)  is  a  block  matrix  such  that  F]}  (Xold,  Anew) 
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is  an  irij  x  77?  ^  zero  matrix,  and 


((.Y-)-'.4.Y-) 
Fa  (XM,  A"")  = : : -2     for  i  #  j, 

■■»   \  /  \neui  \new 


where  ((A'0")    '  AXold\     is  the  (ij)  block  of  (A'oW)    '  AXM.   In  (3.12),  if  we  put 
y»«"  =  /  +  F  (X0,d,Anew),  and  /lo/d=  (A0,d)_1  AXM,  then  the  algorithm  reduces  to 

Anew     =    Dmg  ((y^y1  AoldYolA  , 

vnew  voldynew 

Hence  the  iterative  method  to  find  the  eigenvalues,  and  corresponding  eigenvectors 
of  a  matrix  A  can  be  rewritten  as  follows: 

Anew  l\old\~      Aold-yold 

Anew  T\:~~f  /l"eti/\ 

=    Diag(>l       ),  (313) 

ynew       _      J-Lf'(^new) 
vnew      vold\mew 

where   Anew   is   a   k  x  k   block   matrix   with    A^w    =    diag(  a;tu',«;c"■ a*"*), 

Anew 

Fa  (Anew)  is  a  771,  x  m,  zero  matrix,  and  Fu  (Anew)  = ^ for  i  4  j. 

JJ  v  '  J  J  'J  \  J  .new  nnew  '    J 

uj  "i 

If  the  starting  guess  matrix  of  eigenvectors  Ao  is  not  a  good  approximation  of  Q 
in  the  factorization  A  =  QAQT,  then  the  increment  F{Anew)  in  Ynew  =  /+  F(A'""  ) 
may  be  too  large.  However  with  a  large  increment,  we  may  likely  have  instability 
in  the  updating  process,  and  the  algorithm  may  diverge.  To  restore  the  convergence 
of  the  algorithm,  and  to  have  a  steady  change  in  the  value  of  Ynewi  we  nod  a 
small  increment  in  each  iteration.  To  achieve  this,  we  redefine  each  iterate,  and 
introduce  a  small  positive  parameter.  Let  a  be  a  positive  parameter,  and  define 
Z{a)nem  =  I  +  sF(Anew).  Then  Z{0)new  =  I.  and  Z(l)"""  =  )'ncw.  the  matrix 
generated  by  (3.13).  Define  G{s)  =  nondiag  ((Z(s)0,d)~i  AMZ(s)M\  and  f(s)  = 
\\G(s)\\p\  where  nondiag(2?)  is  a  block  matrix  formed  from  the  k  ■  k  block  matrix 
D  by  replacing  the  diagonal  blocks  by  zero  matrices.  Suppressing  the  superscripts  in 
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Z{s)neu\  and  G(s),  we  have  Z(s)  =  I  +  sF(A),  and  G(s)  =  nondiag  (Z(s)"MZ(s)). 
Hence  /(O)  =  ||nondiag  (A)  ||F,  and  /(l)  =  ||nondiag  (Y~lAY)  \\F. 

With  a  starting  guess  A'o,  if  the  iterations  in  (3.13)  converge,  then  we  must  have 
/(I)  <  /(0).  As  in  the  previous  section,  our  goal  is  to  find  an  s,  0  <  s  <  1,  for  which 
f{s)  <  /(0)  holds.  To  this  end,  we  use  Armijo's  rule  from  optimization  theory.  In 
Armijo's  rule,  we  determine  s  in  the  following  way.  Evaluate  f(s)  at  s  =  1,  |,  |, . . ., 
stopping  when 

/(*)<  (l-|)/(0). 

To  use  the  above  inequality  we  must  make  sure  that  f(s)  satisfies  /'(0)  =  — /(0). 
Hence  our  next  aim  is  to  determine  /'(0),  when  f(s)  =  \\G(s)\\f,  and  G(s)  = 
nondiag  ((Z(5))_1  AZ(sj). 

Theorem  3.2.1  Let  A  €  Rn*  l  be  a  k  x  k  block  symmetric  matrix  such  that  A]3  = 
diag{  a,j,  ay, . . . ,  a,j  )  is  an  rrij  x  rrij  scalar  matrix.  If  we  define  Z(s)  =  I  +  sF(A),  G(s) 

—  nondiag  ((Z(s))~1  AZ(s)),  and  f(s)  =  \\G(s)\\f,  where  Fjj(A)  is  an  rrij  x  nij  zero 

A 

matrix,  FlJ(A)  = —  for  i  ^  j,  and  nondiag (B)  is  formed  from  the  k  x  k  block 

a  j  —  cii 

matrix  B  by  replacing  the  diagonal  blocks  by  zero  matrices,  then  /'(0)  =  — /(0). 

Proof  of  Theorem  3.2.1  The  result  (2.18)  of  Lemma  2.3.1  gives 

f(0)±f(s)\s=o  =  trace  ^'(O^G^u)  .  (3.14) 

Differentiating  G(s),  and  Z(s)  with  respect  to  s,  and  then  evaluating  the  derivatives 
at  s  =  0,  we  get 

^G'(s)U0    =    nondiag('-Z(0)-1^Z(5)|3=0Z(0)-1/lZ(0)  +  Z(0)-1/l^Z(5)|^o)  , 

^-Z(s)\3=0   =   F 

as 

(3.15) 
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Since  Z(0)  =  /,  so  after  simplification  (3.15)  gives 

d 

-TG{s)\.=o  =  noTid\&g{AF-FA).  (3.16) 

as 

Next  G(0)  =  nondiag(Z(0)-MZ(0))  =  nondiag  (7.47)  =  nondiag(.4).    Hence  from 
(3.14),  we  obtain 

/(0)-r/(s)|s=o  =  trace  ((nondiag(,4))Tnondiag(.4F-F/l)).  (3.17) 

But  A  is  symmetric,  so  (nondiag(.4))     =  nondiag(yl).  Hence  (3.17)  reduces  to 

d 
f(0)  —  f(s)\,=o  =  trace  (nondiag(A)nondiag(.4F-  FA)).  (3.18) 

ds 

Next,  for  i  ^  j,  FXJ  = — ,  and  7\,  = — .   Since  A  is  symmetric,  so  FtJ  = 

a.j  —  a,  a,  —  a.j 

—  Fjt.  That  is  F  is  a  skew  symmetric  matrix,  and  therefore  by  Lemma  3. 1.1,  AF—FA 
is  a  symmetric  matrix.  Next 


AF  -  FA  = 

y>  2  AuAji  X  AUA,2       *    AltAl2         y!  AuAik     JU  AuAik 

•*2 

£  A2iAn     JL,  /12,/1,i  *   2/l2,/l,2  y!  A2lA,k      *    .42l/l,fc 


~]"i-«.      ,71  «2  -  «.  TT7a2-af-  *TJ  ak  -  a,      -—;  a2  -  a, 

i*2  1*2  i*2 

~"2  «i  -  a.      ~,  a*  ~  a>   -t  a2  -  «i     ~[  ak  -  a.  JrJ   ak  -  a, 

1*2 


Therefore  from  (3.18),  we  have 

/(0)^/(*)U  = 
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trace 


:     A--A-,  k     A  ■■  A-, 

V*  A     I  V^     Jt        -i-  V^ 


i=2 


t  =  2 


a\  —  at 


X 


>*3 


X 


^.  Aj{A{2       ^-^  Aj{Ai2 


5>w  Er^  +  E 


J^2 


X 

X 


,=i  a2  —  «i       1=1  a3  —  a,i 


X 
X 


X 
X 


El    ^-^    AjiAik-i      ^^.AjiAik-i 
Ak-l3\    2^    ~ —+1^- 


X 


X 
X 


X 


fc-1 


fc_1  A   ■  A-,        k    A     A 

E^IEt^+E  J 


i=i 


,=i   ak  —  ai       ,=i  CLj  —  0,{ 


where  X  are  rectangular  matrices.  After  simplifying  the  right  hand  side  of  the  above 
expression,  we  obtain 

k 


f(0)TJ(s)\s=o 


trace  |  £  Axj     £  -*—  +  Jj  -i— 

_2  I  t=2  "l         ^*  ,=1    ^J         ^! 


+trace     E  ^     E  „ 


Aj{Ai2       ■s—^  AjiA{2 

+  E 


,=1  aj  —  a, 


+  trace     Afe_i*       }2   +  E 

X~i    afc_!  -  a,       ^    ak  -  at 

Gj  —  a2         0,2  —  d\/  / 

AkkAkl       AklAn 
ai  -  afc       a*  -  ax 

^33^32      .     A32A22 


=    2 

+trace  (  /lu- 
+ trace  f  A23 
+ trace  ( A2fc 


+ 
a2  -  a3       a3  -  a2 

AkkAk2      Ak2A22 

a-2  -  ak       ak  -  a2 


+  ■■■ 

+  ■■■  + 
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.         /,         ( AkkAkk-\    ,   Akk-\Ak-\k- 1  \\ 
trace    Ak-u     + +  •  ■  • 

V  \flJk-i  —  a*  ak  —  ak-\     // 

+trac^12(^^l  +  ^^l))+trace(^13  (i^+i^Y) 

V  \ax  —  a3      a2  —  a^J  J  V        \a!  —  a2      «3  —  02// 

\         \a2  —  ai       a3  —  a} /  / 

/  (  Ak-lkAkk-2    .    Ak-\k  Akk-l  \  \ 

+trace    Ak-2k-i     + 

V  V   (ik-2  -  ak  ajt_i  -  ak  )  j 

(  .         ( Akk-\Ak-\k-2    ,   Akk-\Ak-\k-2 
+tracc     Ak-2k h 


( Ak-2k 


dk-2  —  dk-\  <lk  —  flfc-1 

1  'Akk-2Ak-2k-\  ,  Akk-2Ak-2k-\ 
+ trace  \Ak-\k h 


(3.19) 

ak-i—Uk-  2  Uk  —  dk-2 

Since  A, j  =  Ajt,  Ajj,  and  An  are  scalar  matrices,  so  /tjjA,,  =  OjAji,  AJtA„  =  a./lj,, 

and 

Ajj  Ajj        AjiAu  \  I   djAji  OtAji   \  j 


ya,  —  a  j       a  j  —  a,  J  ya,  —  a}       a,  —  a:  j 


Hence 


Next  we  know  that  for  any  matrices  C,  and  D,  trace  (C)  =  trace  (CT).  and 
trace  {CD)  =  trace  (DC),  provided  the  matrix  multiplications  exist.  Let  0  = 
tr&ce  (A{j A ji Ah).  Then  clearly  0  =  trace  l(AijAjiAu)T\  =  trace  f  A* AjtAfA  = 
trace  ( An  Ai  j  A  ji).  Also  0  =  trace  (A<j  (AjiAu))  =  trace  ((AjiAi,)  .1,,) .  Hence 

/       (AjiAu       AnAuW  (  .    (AijAj,       AijAjiW 

1  race    A,j    — +  — +  trace    An    -^—^  + 


a,  —  «;       a,  —  a  1  I  I  \       Va,-  —  a,       «/  —  a. 


1   Ai,A{j        AhAjj 
+trace    Ajt + 


Kdj  —  a,       at  —  a, )  ) 

.  /      1  1  1  1  1 

=    0    -         -  +  -    + + + -  + 


o,  —  a/       cij  —  ai       a,  —  a  j       o;  —  a}       a}  —  a,       «/  —  a,  J 
0-0  =  0.  (3.21) 

Using  the  results  from  (3.20),  and  (3.21)  in  (3.19).  we  have 

f(0)j-f{s)\,=o    =    -2  [trace  (A^A2l)  +  •••  +  trace  (i/.-U.)  +  trace  (A^Asi) 
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+  •  ■  •  +  trace  (ATk2Ak2)  +■■■  +  trace  (4-i^h-.)]  •        (3-22) 


But 


/(0)2     =     trace  (G*(0)TG'(0)) 

=    trace(nondiag(A)nondiag(A)) 

=     2  [trace  (a£A2i)  +  •  •  •  +  trace  (A^A^)  +  trace  (a12A32) 
+  •  •  •  +  trace  (Af2Afc2)  +  •  •  •  +  trace  (Aj^Au-i)    • 
Hence  (3.22)  yields  f(0)-ff(s)\s=o  =  -/(O)2-    So  f  (0)  =  -/(<>),  provided  /(0)  ^ 


ds 
0.    □ 


The  result  of  the  above  theorem  implies  that  Armijo's  rule  can  be  applied  to  the 
algorithm  in  (3.13). 

In  Theorem  3.2.1,  /'(0)  =  -/(0)  holds  only  when  A  is  symmetric.  So  we  need 
to  restore  the  symmetry  of  A  in  the  update  An™  =  (Y01*)'1  AMYM  in  (3.13).  If 
Y°ld  is  an  orthogonal  matrix,  then  Anew  will  be  symmetric.  In  general,  Y°  d  is  not 
an  orthogonal  matrix.  Let  Ynew  =  QnewRnew  be  a  QR  factorization  of  Ynew.  Now 
in  (3.13),  if  we  replace  A—  =  (Y°ld) ~*  A°ldY°ld  by  A""  =  (QoW)T  AoWQoW,  and 
v-ne™  _  Y°'dyneu'  by  Xnew  =  A'oWQneu\  then  Anew  will  retain  the  symmetry,  and 
Xnew  will  retain  the  orthogonality. 

A  sketch  of  an  algorithm  to  find  the  eigenvalues  of  a  symmetric  matrix  A  is  as 

follows: 

Algorithm  3.2.1  (Symmetric  Block  Diaaonalization)  Given  an  n  x  n  symmetric  ma- 
trix A,  an  orthogonal  matrix  X0,  a  tolerance  tol  greater  than  the  unit  roundoff,  and 
a  coalescing  tolerance  toll  greater  than  the  square  root  of  the  unit  roundoff,  the  fol- 
lowing algorithm  overwrites  A  with  A  =  XTAX,  where  A  is  a  block  diagonal  matrix, 
and  X  is  an  orthogonal  matrix. 
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Step  0:  Take  X  =  A0.  Compute  Anew  =  XTAoldX,  and  /(0)  =  ||  nondiag(/l)  ||F, 
where  nondiag(/l)  is  defined  as  earlier. 

Step  1:  Let  NB  —  n  (  the  number  of  diagonal  blocks  of  A  ),  NS  =  [  1,  1, ....  1  ]: 
an  array  of  n  elements  whose  ?th  element  is  the  dimension  of  the  ith  diagonal  block 
of  A.  For  i  <  j,  let  a,  be  the  eigenvalue  of  .4,,,  and  a,  be  the  eigenvalue  of  Ajj. 
If  \a,  —  a_,|  <  toll ,  then  merge  blocks  An  and  A21  to  form  a  single  block  using 
permutations  of  columns  and  rows  of  A.  Use  the  same  column  permutations  to  X  to 
merge  block  columns  A',  and  Xj.  Update  NBold,  and  NSold  to  get  NBnew,  and  iVS"™. 
Try  the  above  merging  procedure  for  all  possible  combinations  of  1  <  /  <  j  <  n. 
Using  column  and  row  permutations  arrange  the  blocks  of  A  such  that  the  diagonal 
blocks  appear  in  the  decreasing  order  of  sizes  along  the  diagonal.  Use  those  column 
permutations  to  arrange  block  columns  of  A  in  the  decreasing  order  of  sizes  as  well. 
Take  A  =  diag(  An,  A22-  ■  ■  ■ ,  ANBnb  )■ 

Step  2:  Construct  block  matrix  F  =  (Fu)  as  follows: 
lake  Fjj  =  0,  which  is  an  nij  x  m.j  zero  matrix,  and  for  i  ^  j,    F{j  can  be  determined 
by  using  the  following  loop. 

for  j  =  1  :  NB 
for  i  =  1  :  NB 

>f  i  ^  J 

Solve  ZAjj  —  A,,Z  =  A,j  for  Z. 

Fa  =  Z 
end 
end 
end 
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Step  3:    Let  f(s)    =    ||G(s)||F,  where  G(s)   =nondiag((7  +  sF)'1  A  {I  +  sF)). 


Evaluate  f(s)  at  s  =  1,-,-,...,  stopping  when 


/W  <  (i  -  f ) /(o). 


(3.23) 


Let  t  be  the  first  value  of  s  for  which  (3.23)  holds.  Let  Z  =  I  +  tF,  and  let  Z  =  QR 
be  a  QR  factorization  of  Z. 

Step  4  :  Update  A,  and  X  as  follows:  Anew  =  QTAoldQ,  and  Xnew  =  XoldQ, 
where  Q  is  the  orthogonal  matrix  from  Step  3. 

Step  5:  Compute  /(0)  =  ||nondiag  (A)  \\p,  and  goto  Step  1  until  /(0)  <  tol. 

Example  3.2.1  If  Algorithm  3.2.1  is  applied  to  the  matrix 


A  = 


with  Xq  =  I,  tol  =  10-6,  and  toll  =  10~4,  then  after  5  iterations  we  obtain  a 
diagonalization  XTAX  =  diag(  9,  9,  18  ).  If  we  use  the  QR  method  for  a  symmetric 
matrix  with  single  implicit  shift  to  A,  then  after  2  iterations  we  get  a  diagonalization 
S  AS  =  diag(  9,  18,  9  ).  Next  consider  the  following  matrix  Ai,  which  we  obtain  by 
perturbing  the  elements  of  A: 


13 

-4 

2  " 

-4 

13 

-2 

2 

-2 

10 

A,= 


13.04     -4.03       2.02 

-4.03      13.02     -2.01 

2.02     -2.01      10.03 


If  we  use  Algorithm  3.2.1  to  A1  with  X0  =  X,  tol  =  10-6,  and  toll  =  10-4,  then  after 
4  iterations  we  obtain  a  diagonalization  YT A\  Y  =  A.  If  we  apply  the  QR  method  for 
a  symmetric  matrix  with  single  implicit  shift  to  XTAiX,  then  also  after  4  iterations 
we  obtain  a  diagonalization  Q   (XT A\X)Q  =  D. 


Example  3.2.2  If  Algorithm  3.2.1  is  applied  to  the  matrix 
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B  =  Gdiag(2,2,2,3,3,3)<3T, 

whore  Q  is  an  orthogonal  matrix  with  A'0  =  J,  tol  =  10-6,  and  toll  —  10-4,  then 
after  10  iterations  we  obtain  a  diagonalization  XTBX  =  diag(3, 3, 3, 2, 2, 2),  wliereas 
if  we  use  the  QR  method  for  a  symmetric  matrix  with  single  implicit  shift  to  B,  then 
after  4  iterations  we  get  a  diagonalization  ST BS  —  diag(  3,  2,  3,  2,  2,  3  ). 

Exampli  3.2.3  If  Algorithm  3.2.1  is  applied  to  Rosser  matrix 


R 


611 

196 

-192 

407 

-8 

-52 

-49 

29 

196 

899 

113 

-192 

-71 

-43 

-S 

-44 

-192 

113 

899 

196 

61 

49 

8 

52 

407 

-192 

196 

611 

8 

44 

59 

-23 

-8 

-71 

61 

8 

411 

-599 

208 

208 

-52 

-43 

1') 

44 

-599 

111 

208 

208 

-49 

-8 

8 

59 

208 

208 

99 

-911 

29 

-41 

52 

-23 

208 

208 

-911 
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with  A'o  =  /,  tol  =  10  6,  and  toll  =  10  4,  then  after  10  iterations  we  obtain  a  di- 
agonalization AT/?A=diag(  1000,  1000.  1020,  0.09805. -1020.04902.  0.  1020.04902, 
1019.90195  ).  If  we  apply  the  QR  method  for  a  symmetric  matrix  with  single 
implicit  shift  to  R,  then  after  7  iterations  we  get  a  diagonalization  ST RS  =  diag( 
-1020.04902,  0.09805,  1000.  0,  1020.04902,  1020,  1019.90195,  1000  ).   Next  consider 

the  matrix 

96  15  -65  11  -98  -75  -77  11 

15  53  38  -50  -80  -37  -39  -75 

-65  38  12  99  96  20  62  40 

11  -50  99  89  1  36  73  -81 

-98  -80  96  1  10  -39  28  29 

-75  -37  20  36  -39  17  31  28 

-77  -39  62  73  28  31  69  -59 

11  -75  40  -81  29  28  -59  30 

If  we  apply  Algorithm  3.2.1  to  the  perturbed  matrix  l\\  =  /)'+  /'/lOOO  with  A'u  =  X, 
tol  =    10"''.  and    toll    =    10~4,   then   after  5   iterations  we  obtain   a  diagonalization 
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Y  R\Y  =  A,  whereas  if  we  apply  the  QR  method  for  a  symmetric  matrix  with 
single  implicit  shift  to  X  R\X,  then  after  8  iterations  we  get  a  diagonalization 
QT(XTR1X)Q  =  D. 

Example  3.2.4  If  Algorithm  3.2.1  is  applied  to  Wilkinson's  21  x  21  test  matrix 


10     1 
1      9     1 
1     8     1 
1     7     1 
1     6     1 
1     5    1 
1    4    1 
1    3    1 
1    2    1 
1     1     1 
1    0     1 
1     1     1 
1     2     1 
1     3    1 
1    4    1 
1     5     1 
1     6     1 
1     7    1 
1     8    1 
1    9      1 
1     10 


with  Xq  =  I,  tol  =  10  6,  and  toll  =  10  4,  then  after  10  iterations  we  obtain  a  diag- 
onalization XTWX  =  diag(  10.74619,  10.74619,  9.21068,  9.21068,  8.03894,  8.03894, 
7.00395,  7.00395,  6.00023,  6.00023,  5.00001,  5.00001,  3.99605,  4.00435,  3.04310, 
2.96106,  2.13021,  1.78932,  0.94753,  0.25381,  -1.12544  ),  whereas  if  we  use  the  QR 
method  for  a  symmetric  matrix  with  single  implicit  shift  to  W,  then  after  32  it- 
erations we  obtain  a  diagonalization  STWS  =  diag(  —1.12544,  0.25381,  0.94753, 
1.78932,  2.13021,  2.96106,  3.04310,  3.99605,  4.00435,  7.00395,  5.00024,  4.99978, 
6.00023,  6.00023,  7.00395,  8.03894,  8.03894,  9.21068,  9.21068,  10.74619,  10.74619  ). 
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Example  3.2.5  If  Algorithm  3.2.1  is  applied  to  Wilkinson's  7x7  test  matrix 


3    1 
1    2    1 

1    1    1 
1    0    1 
1    1    1 
1    2    1 
1    3 


with  A'o  =  /.  tol  =  10  6,  and  toll  =  10  4,  then  after  9  iterations  we  obtain  a 
diagonalization  XTWX  =  diag(  3.76156,  3.73205,  2.36333.  2,  1.  0.26795,  -1.12489), 
whereas  if  we  use  the  QH  method  for  a  symmetric  matrix  with  single  implicit  shift 
to  W,  then  after  12  iterations  we  get  a  diagonalization  STWS  =  diag(  —1.12489, 
0.26795,  1,  2,  2.36333,  0.76156,  3.73205  ). 

Example  3.2.6  If  Algorithm  3.2.1  is  applied  to  Wilkinson's  21  x  21  test  matrix 

Note  that,  if  at  least  two  diagonal  elements  of  the  given  matrix  A  are  equal,  and 
the  off  diagonal  elements  are  large  compared  to  the  diagonal  elements,  then  Algo- 
rithm 3.2.1  can  not  be  applied  directly  to  A.  In  this  case,  first  we  find  a  Hessenberg 
form  //  of  A,  and  then  apply  Algorithm  3.2.1  to  the  Hessenberg  form  //. 

If  we  apply  Algorithm  3.2.1  to  an  n  x  n  symmetric  matrix  A  with  A',,  =  /. 
a  tolerance  tol,  and  a  coalescing  tolerance  toll,  then  we  obtain  a  diagonalization 
XT AX  —  A.  Next  we  can  use  A'  as  the  starting  guess  orthogonal  matrix  to  obtain 
a  diagonalization  of  a  slightly  perturbed  symmetric  matrix  .4  +  eE  with  very  rapid 
convergence,  where  A'  is  an  arbitrary  symmetric  matrix,  and  (  is  a  small  scalar.  In 
the  following  table  for  differenl  values  of  n  we  give  the  number  of  iterations  required 
t<i  obtain  diagonalizations  of  A  +  eE  by  Algorithm  3.2.1  with  A'o  =  A.  and  diago- 
nal:/.it  ions  of  .V '  ( .  1  +  <  E)X  by  t  he  QK  met  hod  for  a  symmel  ric  matrix  with  single 
implicit  shift.    Mere  the  original  matrices  .4  =  (a,j)  are  random  symmetric  matrices. 
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where  a,ij  are  integers  and  0  <  atJ  <  20,  and  the  matrices  E  =  (etj)  are  random 
symmetric  matrices,  where  e,j  are  real  numbers  and  0  <  e,j  <  2.  The  tolerance  is 
tol  —  10-6,  and  the  coalescing  tolerance  is  toll  —  10-4: 


size 

number  of  iterations 
for  our  algorithm 

number  of  iterations  for 
the  QR  algorithm  with  single  shift 

e 

A  +  eE 

A  +  eE 

ST{A  +  eE)S 

10 

3 

16 

15 

ID"2 

20 

3 

33 

32 

10-2 

30 

3 

50 

50 

ID"2 

40 

3 

64 

64 

10-2 

50 

3 

79 

80 

10-2 

CHAPTER  4 
CONCLUSION 

In  the  introduction,  we  reviewed  the  application  of  eigenvalue  sensitivity,  and 
eigenvector  sensitivity  to  derive  an  iterative  method  to  find  all  eigenvalues,  and  cor- 
responding eigenvectors  of  a  matrix  .4,  whose  eigenvalues  are  all  distinct.  To  use 
this  method,  we  need  a  matrix  of  approximate  eigenvectors  Xq.  If  Ao  is  not  a  good 
approximation  of  X  in  the  factorization  A  =  XAX-1,  where  A  is  a  diagonal  matrix, 
then  the  method  diverges.  As  discussed  in  Section  2.1,  the  application  of  eigenval- 
uesensitivity,  and  eigenvector  sensitivity  can  be  theoretically  extended  to  derive  an 
iterative  method  to  find  all  eigenvalues,  and  corresponding  eigenvectors  of  a  nondefec- 
tive  matrix  A,  some  of  whose  eigenvalues  may  be  identical.  The  obvious  difficulty  one 
encounters  in  using  this  method  is  how  to  choose  a  matrix  of  approximate  eigenvec- 
tors for  the  starting  guess.  In  Section  2.4,  we  showed  that  the  block  diagonalization 
method  converges  locally  quadrat ically.  If  by  another  method,  a  matrix  of  eigen- 
vectors X  of  a  matrix  A  can  be  determined  (  in  Section  2.1,  we  applied  the  QR 
method  with  double  implicit  shift  to  obtain  the  eigenvalues,  and  then  solve  Bx  —  0 
for  x,  where  D  =  A  —  A,/  to  obtain  a  corresponding  eigenvector  to  the  eigenvalue 
A,,  i  =  1,  ...,  n.  To  solve  Bx  =  0  for  x,  we  assume  x(i)  =  1.  move  the  ith  column 
of  B  to  the  right  hand  side  of  the  equal  sign,  and  then  solve  an  overdetennined  sys- 
tem of  equations  by  the  QR  factorization  technique,  )  then  the  block  diagonalization 
method  can  he  used  to  find  the  eigenvalues  of  a  slightly  perturbed  matrix  .1  +  iE 
with  A'  as  the  stalling  guess  matrix  of  eigenvectors  with  rapid  convergence,  where  I:' 
is  .in  arbitrary  matrix,  ami  <   is  a  small  scalar. 
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In  the  Differential  Equation  Appoach  to  Eigencomputations,  we  use  the  Euler 
method.  The  differential  equations  that  we  solve  for  the  eigenproblem  are: 

A(t)  =  -A(t)  +  Diag  (X(t)-lAX(t))  ,  and  X(t)  =  X(t)F  {X(t),  A(t)) , 

where  Diag(M)  is  a  block  diagonal  matrix  formed  from  the  diagonal  blocks  of  the 
block  matrix  M,  F3J  {X(t),  A{t))  is  a  square  zero  matrix,  and  FtJ  (X(t),  A(t))  for  i  ^  j 
is  the  solution  B  to  the  matrix  equation  BA3(t)- \i{t)B  =  Yi(t)TAXj(t).  Here  Yt{t)T 
denotes  the  zth  block  row  of  X(<)-1.  The  iterates  to  update  A,  and  X  are: 

An+i(<)  =  A„  +  At  (Diag  (X^AXn)  -  An)  ,  and  Xn+1(t)  =  Xn  +  AtXnF(Xn,An) . 

This  approach  works  only  for  small  matrices  (  Example  2.2.1,  and  2.2.2  ).  If  the 
size  of  a  matrix  is  large,  then  to  achieve  the  convergence  in  the  results,  we  need  to 
take  a  small  stepsize.  However  with  a  small  stepsize,  too  many  iterations  are  required 
to  obtain  a  few  digits  accuracy  in  the  results  (  Example  2.2.3  ). 

In  the  iterative  methods,  discussed  in  Section  2.3,  and  2.6,  the  results  do  not  start 
converging  until  we  finish  almost  one  half  of  the  total  iterations  required.  So  in  these 
iterative  methods  to  avoid  numerical  instability,  we  need  a  small  stepsize  in  each 
iteration  at  the  beginning,  whereas  after  the  results  start  converging,  we  can  take  a 
large  stepsize  in  each  iteration  to  speed  up  the  convergence  in  the  results.  Hence  in 
the  Differential  Equation  Approach  to  Eigencomputations  with  Armijo's  Stepsize,  we 
vary  stepsize  in  each  iteration.  It  turns  out  that  this  method  does  not  work  for  all 
matrices.  For  example,  if  a  matrix  has  multiple  eigenvalues,  or  the  size  of  the  matrix  is 
large,  then  the  method  fails  (  Example  2.3.2  ).  In  all  of  these  cases,  two  1  x  1  diagonal 
blocks  approach  each  other,  but  as  the  iterations  continue,  the  rate  at  which  they 
approach  each  other  gets  slower  and  slower.  However  if  by  another  method,  a  block 
diagonalization  A  =  A'AA'-1  can  be  determined  (  for  example,  using  the  QR  method 
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with  double  implicit  shift  to  .4,  we  can  find  a  real  Schur  form  .4  =  QRQT.  Then 
applying  Algorithm  2.2.1  to  R,  we  can  find  a  block  diagonalization  R  =  PAP-1. 
Hence  A  =  .YAA"-1  is  the  block  diagonalization  of  A,  where  X  =  QP,  )  then  X  can 
be  used  as  the  starting  guess  invertible  matrix,  and  A  can  be  used  as  the  starting 
guess  block  diagonal  matrix  to  obtain  a  block  diagonalization  of  a  perturbed  matrix 
A  +  eE  with  very  rapid  convergence,  where  E  is  an  arbitrary  matrix,  and  e  is  a  small 
number. 

Since  all  matrices  A  are  not  diagonalizable.  so  in  Section  2.5,  we  discuss  the  fac- 
tors sensitivity  in  the  Schur  decomposition  relative  to  perturbations  in  the  coefficient 
matrix  to  compute  a  block  Schur  decomposition  of  a  matrix  A.  In  Section  2.6.  we 
present  an  algorithm  to  compute  a  block  Schur  decomposition  of  a  matrix  .4.  This 
algorithm  is  based  on  sensitivity  results  for  the  factors  in  the  Schur  decomposition 
relative  to  perturbations  in  the  coefficient  matrix,  and  Armijo's  rule  from  optimiza- 
tion theory.  Here  one  needs  a  starting  guess  unitary  matrix  5'0,  and  a  starting  guess 
upper  triangular  matrix  t'0,  whose  diagonal  entries  approximate  the  eigenvalues  of 
the  matrix  A.  If  we  take  So  —  I,  and  a  diagonal  matrix,  whose  diagonal  elements 
are  uniformly  distributed  points  on  a  circle  in  the  complex  plane,  which  includes  all 
Gerschgorin  disks  for  .4  as  Uq,  then  the  algorithm  works  for  all  matrices.  Once  a 
block  Schur  decomposition  A  =  QUQ"  is  computed,  then  this  algorithm  can  be  used 
to  find  a  block  Schur  decomposition  of  a  slightly  perturbed  matrix  4  +  eE  with  V 
as  the  starting  guess  upper  triangular  matrix,  and  Q  as  the  starting  guess  unitary 
matrix  with  very  rapid  convergence,  where  E  is  an  arbitrary  matrix,  and  <  is  a  small 
number.  In  Section  2.7.  we  present  some  node  codes  which,  along  with  some  modifi- 
cations to  Algorithm  2.6.1,  can  be  used  in  a  parallel  computer,  where  processors  are 
set  up  in  a  ring. 
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In  Chapter  3,  we  discussed  how  to  exploit  the  symmetry  of  a  matrix  in  the  di- 
agonalization  process.  In  Section  3.1,  we  showed  how  to  modify  the  diagonalization 
method,  discussed  in  Chapter  1.  We  introduced  Armijo's  stepsize  in  the  modified 
method  to  make  it  more  practical.  This  method  works  for  all  symmetric  matrices 
with  distinct  eigenvalues.  In  Section  3.2,  we  also  modified  the  block  diagonalization 
method,  which  is  discussed  in  Section  2.1.  Again,  Armijo's  stepsize  is  introduced  to 
make  the  method  practical.  If  at  least  two  diagonal  elements  of  a  symmetric  matrix 
A  are  equal,  and  the  off  diagonal  elements  are  large  compared  to  the  diagonal  ele- 
ments, then  we  can  not  apply  this  method  directly  to  A.  In  this  case  first,  we  reduce 
A  to  a  symmetric  tridiagonal  matrix  H  using  the  Hessenberg  reduction  to  A,  and 
then  apply  the  algorithm  to  H. 
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